Conference Paper

Querying a Bioinformatic Data Sources Registry with Concept Lattices

Abstract and Figures

Bioinformatic data sources available on the web are numerous and heterogeneous. The lack of documentation and the difficulty of interacting with these data banks require users to be competent in both informatics and biology in order to make optimal use of the sources' contents, which remain rather underexploited. In this paper we present an approach based on formal concept analysis to classify and search for bioinformatic data sources relevant to a given user query. It consists in building the concept lattice from the binary relation between bioinformatic data sources and their associated metadata. The concept built from a given user query is then merged into the concept lattice. The result is obtained by extracting the set of sources belonging to the extents of the subsumers of the query concept in the resulting concept lattice. The sources are ranked according to the concept specificity order in the concept lattice. An improvement of the approach consists in automatically refining the query using domain ontologies. Two forms of refinement are possible: by generalisation and by specialisation.
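The retrieval scheme outlined in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the source names, the metadata terms, the `__query__` label, the exclusion of the top concept (empty intent), and the use of intent size to linearise the specificity order are all assumptions made for illustration.

```python
def intents_of(context):
    """All concept intents: the full attribute set plus every intersection
    of object attribute sets (naive incremental closure)."""
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {frozenset(attrs) & intent for intent in intents}
    return intents


def query_sources(context, query_attrs):
    """Rank sources found in the extents of the subsumers of the query concept."""
    query = frozenset(query_attrs)
    extended = dict(context)
    extended["__query__"] = query  # the query is merged as a new object (hypothetical label)
    best = {}
    for intent in intents_of(extended):
        # A subsumer of the query concept has an intent included in the query.
        # Assumption: the top concept (empty intent) is skipped.
        if intent and intent <= query:
            for source, attrs in context.items():
                if intent <= attrs:
                    best[source] = max(best.get(source, 0), len(intent))
    # Assumption: a larger shared intent means a more specific concept.
    return sorted(best, key=lambda s: (-best[s], s))


# Hypothetical registry: data sources described by metadata terms.
sources = {
    "SourceA": {"nucleotide sequences", "human", "curated"},
    "SourceB": {"protein sequences", "human"},
    "SourceC": {"nucleotide sequences", "mouse"},
    "SourceD": {"protein structures"},
}
ranked = query_sources(sources, {"nucleotide sequences", "human"})
print(ranked)
```

In this toy registry the source matching both query terms is ranked first, followed by the sources matching only one of them. A real lattice is a partial order, so ties between incomparable concepts would need a policy that this sketch does not address.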
ccsd-00000102 : 17 Dec 2002
Scanning Tunneling Microscopy in TTF-TCNQ: direct proof of
phase and amplitude modulated charge density waves
Z. Z. Wang,1 J. C. Girard,1 C. Pasquier,2 D. Jérome,2 and K. Bechgaard3
1 Laboratoire de Photonique et de Nanostructures,
CNRS, route de Nozay, 91460, Marcoussis, France
2 Laboratoire de Physique des Solides (CNRS),
Université Paris-Sud, 91405, Orsay, France
3 Polymer Science Department, Research Center Risø, DK 4000, Roskilde, Denmark
(Dated: December 17, 2002)
Abstract
Charge density waves (CDW) have been studied at the surface of a cleaved TTF-TCNQ single
crystal using a low temperature scanning tunneling microscope (STM) under ultra high vacuum
(UHV) conditions, between 300K and 33K with molecular resolution. All CDW phase transitions
of TTF-TCNQ have been identified. The measurement of the modulation wave vector along the
$a$ direction provides the first evidence for the existence of domains comprising single plane wave
modulated structures in the temperature regime where the transverse wave vector of the CDW is
temperature dependent, as hinted by the theory more than 20 years ago.
The discovery of the organic molecular crystal tetrathiafulvalene-tetracyanoquinodimethane
(TTF-TCNQ), comprising weakly coupled one dimensional
(1D) molecular stacks, created tremendous excitement in 1973 [1, 2]. This was the first
molecular crystal to show a conductivity approaching that of conventional metals at room
temperature and to exhibit metal-like behaviour on cooling. A partial charge is transferred
from the TTF to the TCNQ stacks, and the charge density $\rho$ potentially available for transport is
determined by the value of $k_F$ at which the bonding TCNQ band crosses the antibonding
TTF band, leading to $2k_F = \rho\pi/b$ in the 1D band picture, where $b$ is the unit cell length.
Between 54 K and 38 K, CDW's successively develop in the TCNQ and TTF stacks. These
transitions have been ascribed to the instability of a one dimensional electronic gas due to
the Peierls mechanism; see reference [3] for a review.
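As a worked illustration of the relation between $2k_F$ and the charge transfer, the longitudinal CDW period can be converted into a value of $\rho$. This is only a sketch: it assumes that the period along $b$ is $2\pi/2k_F$, and the symbol $\lambda_b$ is introduced here for convenience. The periods $3.3b$ and $3.39b$ are the values quoted for the superstructure later in the text.

```latex
\lambda_b = \frac{2\pi}{2k_F} = \frac{2b}{\rho}
\quad\Longrightarrow\quad
\rho = \frac{2b}{\lambda_b},
\qquad
\lambda_b = 3.3\,b \;\Rightarrow\; \rho \approx 0.61,
\qquad
\lambda_b = 3.39\,b \;\Rightarrow\; \rho \approx 0.59 .
```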
When the CDW's are active on both kinds of stacks, frustration arises and the 2D ordered
superlattice can be described by plane waves with the wave vectors $q_+ = [+q_a(T), 2k_F]$ or
$q_- = [-q_a(T), 2k_F]$. Both lead to configurations which are energetically equivalent. The wave
vector $q_+$ gives rise to a charge modulation such as $\rho(r) = \rho_+ \cos(q_+ \cdot r + \theta_+)$, which is a CDW
of fixed amplitude and a phase varying like $q_a a$ along the $a$ direction, and similarly for the
$q_-$ wave vector. Consequently, the diffraction pattern of the CDW state should display an
equal number of domains characterized by the vectors $q_+$ and $q_-$.
There also exists another possibility, namely the superposition of the two plane waves
$q_+$ and $q_-$, which leads to a CDW with constant phase but a modulated amplitude along the
$a$ direction [4, 5] (the double-q configuration). The only solution which can take advantage of the
commensurability energy related to the transverse commensurate periodicity through the
fourth order Umklapp term in a Landau-Ginzburg expansion is the double-q configuration
[5]. This means that both wave vectors are simultaneously activated below 38 K, with four
satellite spots at $\pm q_+$ and $\pm q_-$ in reciprocal space around a main Bragg spot. On the
other hand, it has been pointed out that the phase modulated solution should be the most
stable one in the incommensurate transverse wave vector temperature regime and also the
only one to provide a smooth onset at 49 K [5]. The microscopic coexistence of the
vectors $q_+$ and $q_-$ below 38 K has been shown by X-ray diffuse scattering [6] and a structural
determination [7]. However, in spite of the data of an early STM study of TTF-TCNQ
[8] showing a phase modulated 2D structure at 42 K, there exists no direct evidence of a
transition from a phase modulated regime between 49 and 38 K, where the temperature
dependence makes the amplitude of $q_a$ slide, to an amplitude modulated situation below
38 K. Diffraction experiments performed on a bulk sample have failed to provide a clue since
they cannot tell the difference between an amplitude modulated configuration and one in
which the phase is modulated with an equal number of domains with $q_+$ and $q_-$. Therefore,
only local probes such as STM are likely to provide an
answer to this problem.
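The ambiguity of bulk diffraction described above can be checked numerically. The following sketch is an illustration under simplifying assumptions, not a model of the experiment: the grid size and the wave vectors (`N`, `KA`, `KB`) are hypothetical, the underlying lattice is ignored, and the domains are assumed to scatter incoherently so that their intensities simply add.

```python
import numpy as np

N = 64            # grid points per side (illustrative)
KA, KB = 8, 20    # hypothetical cycles across the window along a and b

pos_a, pos_b = np.meshgrid(np.arange(N) / N, np.arange(N) / N, indexing="ij")

# Single plane waves q+ = [+qa, 2kF] and q- = [-qa, 2kF] (phase modulated).
wave_plus = np.cos(2 * np.pi * (KA * pos_a + KB * pos_b))
wave_minus = np.cos(2 * np.pi * (-KA * pos_a + KB * pos_b))

# Double-q state: coherent superposition (amplitude modulated along a).
double_q = wave_plus + wave_minus


def intensity(pattern):
    """Squared modulus of the 2D Fourier transform, normalised to its strongest peak."""
    spectrum = np.abs(np.fft.fft2(pattern)) ** 2
    return spectrum / spectrum.max()


def satellite_count(spectrum):
    """Number of pixels holding at least half of the strongest intensity."""
    return int(np.sum(spectrum > 0.5 * spectrum.max()))


# Bulk sample with equal q+ and q- domains: intensities add incoherently.
domain_average = intensity(wave_plus) + intensity(wave_minus)
coherent = intensity(double_q)

same_diffraction = bool(np.allclose(domain_average, coherent))
factorised = bool(np.allclose(
    double_q, 2 * np.cos(2 * np.pi * KA * pos_a) * np.cos(2 * np.pi * KB * pos_b)
))
print(same_diffraction, factorised, satellite_count(coherent))
```

Under these assumptions both situations yield the same four satellites, while in real space the double-q pattern factorises into a fixed-phase wave along $b$ whose amplitude is modulated along $a$. This is the distinction a local probe can make and a domain-averaged measurement cannot.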
The present STM investigation of a TTF-TCNQ single crystal has been performed in
a broad temperature range (33-300 K). The primary goal was to achieve the best possible
experimental conditions in order to provide local information regarding the development of
3D ordered CDW's below 54 K. This work provides the first direct experimental proof of the
existence of phase modulated CDW's between 49 and 38 K and of amplitude modulated CDW's below
38 K, and also supports the model proposed by theoreticians more than 20 years
ago [4, 9].
The experiment was carried out in a UHV-LT-STM system with separate UHV chambers
for STM measurements and for sample and tip preparation. The base pressure in each chamber
is in the range of $10^{-11}$ mbar. A commercially available LT-STM head is used in this
study, and the entire scanning unit (including tip, sample, piezo tube, piezo motor and
damping system) is inserted in a thermostat with four gold-plated cold shields (Omicron LT-STM).
The sample temperature is controlled by a Lake Shore DRC 91C controller. Typical
temperature fluctuations are less than 20 mK in 200 seconds, with an average temperature
drift below 50 mK per hour. Mechanically sharpened Pt/Ir tips were used. The durability of
the tips is demonstrated by their ability to maintain molecular resolution of TTF-TCNQ
for hours. The quality of the tips is checked by their ability to obtain atomic resolution
on a gold surface before and after imaging of TTF-TCNQ. We image the sample in
constant current mode. The maximum data rate is 100 kHz and the typical time needed to
record one image is 200 seconds.
Crystals of TTF-TCNQ with well-formed natural faces and typical dimensions of 3 × 0.5
× 0.05 mm$^3$ are selected for this experiment. A clean (001) surface is obtained by cleaving
the single crystal with a razor blade in air just before insertion. Direct exposure to air
is restricted to less than 2 minutes. In order to avoid micro-cracks in TTF-TCNQ while
cooling or warming, the temperature variation rate is kept at 1 K per minute.
FIG. 1: a) STM image of the a-b plane of TTF-TCNQ taken at 63 K. The image area is 5.3 nm ×
5.3 nm. b) Profile along a TCNQ stack indicated by black arrows.
Figure 1a displays a typical image of the a-b plane (area 5.3 × 5.3 nm$^2$) obtained in
a constant current mode (I = 1 nA, V = 50 mV) at 63 K where a 1D structure of parallel
chains is clearly observed, with one set of chains containing a triplet of balls and the other a
doublet. According to the calibration of the piezo at low temperature, the distance between
similar chains is 1.22 nm and the distance between units along the chain direction is 0.38 nm, see fig. 1b.
Both distances compare very well with the $a$ and $b$ lattice constants, $b$ = 0.3819 nm and
$a$ = 1.229 nm [10]. We can ascribe the triplet feature in fig. 1a to the TCNQ, in agreement
with the early work of Sleator and Tycko [11]. The TTF molecule usually appears as a
single ball feature in STM imaging, although doublet structures have also been
reported in the literature [12]. An extensive interpretation of the TTF-TCNQ image in the
absence of CDW will be given in a forthcoming paper [13], and the present work is restricted
to the physics provided by STM images of the TCNQ molecules only. No bias voltage
dependence (polarity) of the image was observed during our measurements, in agreement
with the expected conducting nature of the surface [14]. In the whole temperature domain
where the sample is metallic, i.e. above 54 K, images like fig. 1a were observed and we could
not detect any modulation on the STM image besides that provided by the uniform TTF-TCNQ
lattice. Therefore, the periodic modulation along the TCNQ stacks reported in ref.
[8] at 61 K could be related to static CDW's stabilized by defects or steps on the surface, as
noticed by the authors.
FIG. 2: a) STM image of the a-b plane of TTF-TCNQ taken at 49.2K. The image area is 8.7 nm
x 11.9 nm, b) Fourier transformed pattern showing the 2a×3.39b CDW ordering.
Below 54 K a two dimensional superstructure restricted to the TCNQ chains with a period
of 2a × 3.3b appears in the image (see fig. 2a).
The modulation wave vector (shown in fig. 2b by Fourier transforming the image) does
not vary down to 49 K. On further cooling, the transverse modulation vector becomes incommensurate
(IC) and a temperature dependence $q_a(T)$ is observed without noticeable
change along $b$, figs. 3a,b. The Fourier transformed image shows that the modulation can
be described by a single wave vector $q_+$ or $q_-$ in the temperature domain 49-38 K. However,
a transverse commensurability (×4) arises abruptly at 38 K. The ordering of the charge
density modulations along both the $a$ and $b$ directions at 36.5 K is presented in figs. 3c,d. Below
38 K (low temperature commensurate phase) a double-q CDW modulation with $q_+$ and $q_-$ is
identified.
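Reading a superlattice period off a Fourier transformed image, as done above, can be sketched on synthetic data. This is an illustration only and not the authors' analysis: the image is a single ideal plane wave with period 2a × 3.3b, the sampling (`PX_PER_A`, `PX_PER_B`) and image size are hypothetical, and the window is chosen to hold a whole number of periods so that the Fourier peak falls exactly on a frequency bin.

```python
import numpy as np

A_NM, B_NM = 1.229, 0.3819   # lattice constants quoted in the text (nm)
PX_PER_A, PX_PER_B = 8, 10   # hypothetical sampling: pixels per unit cell
N_A, N_B = 160, 330          # hypothetical image size in pixels (20a x 33b)

# Synthetic single plane wave with period 2a along a and 3.3b along b.
pos_a = np.arange(N_A)[:, None] / PX_PER_A   # position along a, in units of a
pos_b = np.arange(N_B)[None, :] / PX_PER_B   # position along b, in units of b
image = np.cos(2 * np.pi * (pos_a / 2.0 + pos_b / 3.3))


def cdw_periods(img, step_a, step_b):
    """Periods (along a, along b) of the strongest Fourier component.

    Assumes that component has a nonzero frequency along both axes.
    """
    spectrum = np.abs(np.fft.fft2(img - img.mean()))
    ia, ib = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    freq_a = np.fft.fftfreq(img.shape[0], d=step_a)[ia]   # cycles per nm along a
    freq_b = np.fft.fftfreq(img.shape[1], d=step_b)[ib]   # cycles per nm along b
    return 1.0 / abs(freq_a), 1.0 / abs(freq_b)


period_a_nm, period_b_nm = cdw_periods(image, A_NM / PX_PER_A, B_NM / PX_PER_B)
periods_in_cells = (period_a_nm / A_NM, period_b_nm / B_NM)
print(periods_in_cells)
```

On a measured image the period is generally not commensurate with the scan window, so the peak position is only known to within one frequency bin. Larger images therefore give a finer estimate of the wave vector.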
The images presented above are the first to report a study of the 2D superlattice structure
of TTF-TCNQ in real space below the Peierls transition down to the temperature of 33 K.
The value and temperature dependence of the modulation wave vector are in very good
agreement with the detailed X-ray [6, 15, 16] and neutron scattering [17, 18] reports (see
fig.4a).
FIG. 3: a) STM image of the a-b plane of TTF-TCNQ taken at 39 K; the image area is 9.3 nm × 6.9
nm. b) Fourier transformed pattern showing the single-q CDW in the sliding temperature domain.
c) STM image of the a-b plane of TTF-TCNQ taken at 36.5 K; the image area is 14.8 × 14.7 nm. d)
Fourier transformed pattern showing the double-q (4a × 3.3b) CDW in the commensurate phase.
We can provide a real space signature of the intermediate temperature regime in which
the transverse period is evolving with temperature (the sliding regime) before a lock-in takes
place at 38 K. Although the signal coming from the CDW modulation is always dominant
in all our scans (with a corrugation of 0.21 nm at 36.5 K along the TCNQ stacks), it does
not overwhelm the corrugation coming from the underlying TCNQ lattice, namely 0.12 nm.
Thanks to the coexistence of the CDW and the original lattice in the images, molecular
resolution can be obtained in the CDW condensed state at low temperature.
FIG. 4: a) Temperature dependence of the CDW wavelengths along the $a$ (triangles) and $b$ (open
squares) directions in unit cell dimensions. The widely scattered data at T = 40.6 K were
taken from small images of 5 nm × 5 nm, while the data at other temperatures were taken from images larger
than 10 nm × 10 nm. The solid (dotted) lines are obtained from X-ray diffraction measurements
on warming (cooling). b) Cosine fit of the CDW profile at 36.5 K along the $a$ direction
indicated by black arrows in fig. 3c, revealing the CDW phasing.
This is at variance with layered compounds such as 1T-TaSe2, where the image is dominated
by the CDW superlattice, but somewhat similar to the situation in 2H-NbSe2 [19].
The very good agreement between the real space CDW features and the results from the
neutron scattering experiments shows that cleaved surfaces are highly ordered and retain
the electronic properties of the bulk material. A similar conclusion was reached in ARPES
experiments performed on cleaved (001) surfaces of TTF-TCNQ [20, 21]. The salient result
of this work is given in fig.3 which makes it clear that warming through the transverse
lock-in transition the modulation evolves from an amplitude modulation along $a$ (double-q
superlattice) in the commensurate phase to a phase modulation in the incommensurate wave
vector regime, with only a single-q vector activated over the investigated sample area.
Thus, we have shown that TTF-TCNQ adopts a domain structure in the temperature
regime where the transverse ordering of the CDW's is incommensurate. This is probably
the key to understanding the hysteresis displayed by $q_a(T)$ between 49 K and 38 K [17, 18], as
suggested in [22, 23].
The fact that the CDW is observable by an STM probe shows that it is static in spite of
its incommensurate nature (along the $b$ direction), and is therefore pinned by impurities or
defects.
The low temperature CDW in TTF-TCNQ is thus an ideal candidate to study the local
phase shift for the following reasons: the unit cell in the a-b plane has a quadratic symmetry,
the CDW phase is commensurate in the $a$ direction but incommensurate in the $b$ direction,
the CDW modulation is double-q modulated below 38 K so the phase shifts along $a$ and $b$
can be studied separately, and in addition a modulation of the amplitude along $a$ is expected.
Furthermore, we notice on figs.4a,b that the phase of the CDW is such as to present
an alternation of the amplitude on the TCNQ stacks like + + ++, etc... along the a
direction.
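The kind of cosine fit used to read the CDW phasing off a profile along $a$ (fig. 4b) can be sketched as follows. This is a minimal illustration and not the authors' fitting procedure: the period is assumed fixed at 4a, and the profile is synthetic, with hypothetical amplitude, phase, offset and noise level.

```python
import numpy as np

A_NM = 1.229  # lattice constant a quoted in the text (nm)


def fit_cosine(x, z, period):
    """Least-squares fit of z ~ offset + amplitude * cos(2*pi*x/period + phase)."""
    q = 2 * np.pi / period
    design = np.column_stack([np.cos(q * x), np.sin(q * x), np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    c, s, offset = coeffs
    # amplitude*cos(qx + phase) = c*cos(qx) + s*sin(qx), with
    # c = amplitude*cos(phase) and s = -amplitude*sin(phase).
    return np.hypot(c, s), np.arctan2(-s, c), offset


# Synthetic height profile (hypothetical values, in nm) over three periods of 4a.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 12 * A_NM, 200)
true_amplitude, true_phase, true_offset = 0.10, 0.70, 0.05
z = true_offset + true_amplitude * np.cos(2 * np.pi * x / (4 * A_NM) + true_phase)
z = z + rng.normal(scale=0.005, size=x.size)

amplitude, phase, offset = fit_cosine(x, z, period=4 * A_NM)
print(round(float(amplitude), 3), round(float(phase), 2), round(float(offset), 3))
```

Fixing the period turns the fit into a linear problem, which avoids the starting-value sensitivity of a nonlinear fit. If the period were also unknown, it would have to be estimated first, for instance from the Fourier transform of the image.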
The phasing of the CDW with respect to the underlying lattice below 38 K agrees with
the diffraction experiments data [15]. The results of our work show that STM techniques are
very well adapted to the local study of CDW’s in TTF-TCNQ and resolve the question of
phase against amplitude modulation. In addition, this work opens new ways towards a local
investigation of the pinning of the CDW’s around impurities to derive information about
the nature of the pinning mechanism (strong or weak).
We thank J. P. Pouget, K. Maki and E. Canadell for very fruitful discussions. Z. Z. Wang acknowledges
the financial support of the SESAME contract 1377.
[1] L. B. Coleman et al. Solid State Comm., 12:1125, 1973.
[2] J. P. Ferraris et al. J. Am. Chem. Soc., 95:948, 1973.
[3] D. Jérome and H. J. Schulz. Advances in Physics, 31:299, 1982.
[4] A. Bjeliš and S. Barišić. Phys. Rev. Lett., 37:1517, 1976.
[5] E. Abrahams, J. Solyom, and F. Woynarovich. Phys. Rev. B, 16:5238, 1977.
[6] S. Kagoshima, T. Ishiguro, and H. Anzai. J. Phys. Soc. Japan, 41:2061, 1976.
[7] Y. Bouveret and S. Megtert. J. Physique France, 50:1649, 1989.
[8] T. Nishiguchi et al. Phys. Rev. Lett., 81:3187, 1998.
[9] P. Bak and V. J. Emery. Phys. Rev. Lett., 36:978, 1976.
[10] T. J. Kistenmacher, T. E. Phillip, and D. O. Cowan. Acta Cryst. B, 30:763, 1974.
[11] T. Sleator and R. Tycko. Phys. Rev. Lett., 60:1418, 1988.
[12] N. Kato et al. Nanotechnology, 7:122, 1996.
[13] P. Ordejon, E. Canadell, Y. J. Lee, and R. Nieminen. To be published.
[14] W. Sacks, D. Roditchev, and J. Klein. Phys. Rev. B, 57:13118, 1998.
[15] J. P. Pouget. In: Semiconductors and Semimetals, page 87. Academic Press, 1988.
[16] S. K. Khanna et al. Phys. Rev. B, 16:1468, 1977.
[17] W. D. Ellenson et al. Solid State Comm., 20:53, 1976.
[18] W. D. Ellenson et al. Phys. Rev. B, 16:3244, 1977.
[19] B. Giambattista et al. Phys. Rev. B, 37:2741, 1988.
[20] F. Zwick et al. Phys. Rev. Lett., 81:2974, 1998.
[21] R. Claessen et al. Phys. Rev. Lett., 88:096402, 2002.
[22] J. P. Pouget and R. Comes. In: Charge Density Waves in Solids, page 85. Eds. L. P. Gor'kov and
G. Grüner, Elsevier, 1989.
[23] S. Barišić and A. Bjeliš. In: Theoretical Aspects of Band Structure and Electronic Properties of
Pseudo-One-Dimensional Solids, page 49. D. Reidel Publishing Company, 1985.
... A contribution to semantic indexing and information retrieval based on FCA was proposed in [Codocedo et al., 2012], and in [Codocedo et al., 2013] the authors use pattern structures to handle more complex data. In [Messai et al., 2005], the authors use concept lattices for the discovery and querying of genomic resources on the web, and in [Alam et al., 2013] a lattice-based approach was proposed for organising and accessing linked open data in the field of biology. ...
... In what follows we recall the definition of a simple query for querying concept lattices [Messai et al., 2005]. We keep this definition for querying our relational structure. ...
... This form makes it easier to insert the query into the concept lattice using an incremental concept lattice construction algorithm. Such an insertion can be seen as adding a new entry (a new object and its attributes) to the formal context under consideration, as described in the definition below [Messai et al., 2005]. ...
Thesis
A document collection is generally represented as a set of documents, but this modelling cannot account for the intertextual relations and the context of interpretation of a document. The classical document model reaches its limits in specialised domains where information access needs correspond to specific uses and where documents are linked by many types of relations. This thesis proposes two models that take this complexity of document collections into account in information access tools. The first model is based on formal and relational concept analysis, the second on semantic web technologies. Applied to documentary objects, these models make it possible to represent and query, in a unified way, the content descriptors of documents and the intertextual relations between them.
... On textual data, (Carpineto and Romano, 2005) propose an information retrieval method based on concept lattices. In (Messai et al., 2005), the authors used concept lattices for the discovery and querying of genomic resources on the web. ...
Book
Work on ontologies and semantic resources is active in various computer science communities such as the Web, bioinformatics, the medical domain and geographic information systems. Semantic resources such as ontologies, lexical databases and thesauri are thus being developed and are readily available. Data mining and information extraction techniques make it possible to build, clean and enrich these semantic resources. The RISE workshop is therefore specifically dedicated to the use of mining techniques to develop semantic resources used in information retrieval systems. The workshop thus aims to provide a meeting place for researchers from different communities such as knowledge discovery and information retrieval, but also the semantic web and natural language processing.
... The hydroecologists' questions could be treated as queries to a database. What FCA brings is an inherent organisation of the answers to their queries through hierarchical grouping (Messai et al., 2005; Azmeh et al., 2011 [FIG. 3: Relational attributes built from the Taxon concepts and the a_abondance relation between SiteEchantillon and Taxon] ... question of an expert is to find the sites sampled in still water, the lattice on the left of figure 2 organises the answers via C_SiteEchantillon_5 and its sub-concepts. The expert thus sees which answers match the question with the minimum of additional characteristics, whereas, as he or she moves down the lattice, characteristics are added to the groups. ...
... Lattices are also used in information retrieval (IR) [CR04, MDNST05]. The use of FCA in IR is motivated, among other things, by the obvious analogy between the object/attribute associations of FCA and the document/term associations of IR. Under this analogy, formal concepts can be regarded as classes of documents that match a user query. ...
Thesis
This thesis deals with the use of ontologies and knowledge bases to guide various steps of the knowledge discovery in databases (KDD) process, with an application in pharmacogenomics. Data in this domain are heterogeneous, complex and distributed across various databases, which makes the preliminary step of preparing and integrating the data to be mined crucial. To guide this step, I propose an original data integration approach that relies on a representation of domain knowledge in the form of two description logic ontologies: SNP-Ontology and SO-Pharm. This approach was implemented using semantic web technologies and leads to the population of a pharmacogenomic knowledge base. Having the data to be mined available in a knowledge base opens up new possibilities for the knowledge discovery process. I first addressed the problem of selecting the most relevant data to mine, showing how the knowledge base can be exploited for this purpose. I then described, and applied to pharmacogenomics, a method that allows knowledge to be extracted directly from a knowledge base. This method, called Role Assertion Analysis (Analyse des Assertions de Rôles, AAR), makes it possible to apply data mining algorithms to a set of assertions from the pharmacogenomic knowledge base and to make explicit new and relevant knowledge hidden in it.
... Query refinement consists in reformulating a query by adding attributes to it from one or more semantic resources (domain ontologies, thesauri, taxonomies, etc.) [Messai et al., 2005, Messai et al., 2006a]. The added attributes must be semantically related to those of the query. ...
Thesis
This thesis deals with the use of domain knowledge in a process for discovering biological data sources on the Web. First, sets of metadata are used to describe the content and quality of the data sources. Then, based on these metadata, the sources are organised in a concept lattice according to their shared characteristics. The concept lattice is the support for data source discovery, which is carried out in two different and complementary ways: by navigation and by querying. In both cases the discovery of data sources can be guided by domain knowledge. During discovery by navigation, the knowledge is used either to reduce the search space or to steer navigation towards selected concepts. During discovery by querying, the domain knowledge is either expressed as preferences between metadata in the query or used to enrich (or reformulate) the query. To take domain knowledge into account more faithfully, we introduced multivalued concept lattices. Organising the data sources as a multivalued concept lattice makes it possible to control the size of the search space and to increase the flexibility and performance of the discovery process in both modes. Navigation can be carried out in lattices of different levels of specialisation, with the possibility of dynamic zooms that move from one lattice to another. Querying benefits from increased expressiveness in the queries.
... Thus in [1] the retrieved documents are found using the distance from the concepts of the lattice to the "query concept", i.e. the concept whose intent coincides with the query. In [11] lattice-based IR techniques for classifying and searching relevant bioinformatic data sources are used to find the needed information by exploring only the superconcepts of the query concept in the conceptual structure. In [4] an FCA-based approach is proposed for semantic indexing and retrieval, using the Galois lattice as a semantic index and as a search space to model terms. ...
Article
In this paper we describe an iterative approach based on formal concept analysis to refine the information retrieval process. Based on weights for ranking documents we define a weighted formal context. We use a Galois connection to introduce a new type of formal concept that allows us to work with specific thresholds for searching words in Web documents. By increasing the threshold, we obtain smaller lattices with more relevant concepts, thus improving the retrieval of more specific items. We use techniques for processing large data sets in parallel, to generate sequences of Galois lattices, overcoming the time complexity of building a lattice for an entire large context.
Article
Everyone who works in the legal field is faced with the complexity of documentary sources of law, that are highly interrelated and interdependent of each others. It is essential for legal practitioners to rely on systems that retrieve all the sources related to the legal cases they are working on and not only the most relevant ones. The challenge for legal IR is to achieve exhaustivity and handle this complexity by retrieving documents on the basis of the semantic content and the intertextual relationships. This work proposes an IR approach for legal sources that goes beyond existing systems. It is based on Formal and Relational Concept Analysis to structure, query and browse collections of legal documents.
Book
Relational Concept Analysis (RCA) constructs conceptual abstractions from objects described by both own properties and interobject links, while dealing with several sorts of objects. RCA produces lattices for each category of objects and those lattices are connected via relational attributes that are abstractions of the initial links. Navigating such interrelated lattice family in order to find concepts of interest is not a trivial task due to the potentially large size of the lattices and the need to move the expert's focus from one lattice to another. In this paper, we investigate the navigation of a concept lattice family based on a query expressed by an expert. The query is defined in the terms of RCA. Thus it is either included in the contexts (modifying the lattices when feasible), or directly classified in the concept lattices. Then a navigation schema can be followed to discover solutions. Different navigation possibilities are discussed.
Article
ABSTRACT. Several structures used for conceptual clustering are presented in a uniform framework based on the notion of Galois lattices (concept lattices). The unified framework reveals the common aspects of the methods and suggests a variety of potential combinations. The usefulness of these structures is illustrated by several applications: information retrieval, software reuse, design of class hierarchies, data mining, knowledge acquisition and organization. KEY WORDS: conceptual clustering, concept formation, Galois lattice, concept lattice, applications.
Conference Paper
An incremental concept lattice construction algorithm, called AddIntent, is proposed. In experimental comparison, AddIntent outperformed a selection of other published algorithms for most types of contexts and was close to the most efficient algorithm in other cases. The current best estimate for the algorithm's upper bound complexity to construct a concept lattice L whose context has a set of objects G, each of which possesses at most max(|g'|) attributes, is $O(|L|\,|G|^2 \max(|g'|))$.
Chapter
A complex concept lattice can possibly be split up into simpler parts. Here the mathematical model must prove its worth by providing efficacious and versatile methods for the decomposition. Every such decomposition principle can be reversed to make a construction method. Therefore, some of the following subjects will be taken up again in the next chapter with this second focus.
Chapter
Maps between concept lattices that can be used for structure comparison are above all the complete homomorphisms. In Section 3.2 we have worked out the connection between compatible subcontexts and complete congruences, i.e., the kernels of complete homomorphisms. A further approach consists in coupling the lattice homomorphisms with context homomorphisms. In this connection, it seems reasonable to use pairs of maps, i.e., to map the objects and the attributes separately. Those pairs can be treated like maps. We do so without further ado and write, for instance, $$(\alpha ,\beta ):(G,M,I) \to (H,N,J),$$ if we mean a pair of maps \( \alpha :G \to H,\ \beta :M \to N, \) using the usual notations for maps by analogy. This does not present any problems, since in the case that \( G \cap M = \emptyset = H \cap N \) we can replace such a pair of maps (α,β) by the map $$\alpha \cup \beta :G\dot \cup M \to H\dot \cup N$$
Article
A lattice-based model for information retrieval was suggested in the 1960s but has been seen ever since as a theoretical possibility that is hard to apply in practice. This paper attempts to revive the lattice model and demonstrate its applicability in an information retrieval system, FaIR, that incorporates a graphical representation of a faceted thesaurus. It shows how Boolean queries can be lattice-theoretically related to the concepts of the thesaurus and visualized within the thesaurus display. An advantage of FaIR is that it allows for a high level of transparency of the system which can be controlled by the user.