Nacéra Bennacer Seghouani
Laboratoire Interdisciplinaire des Sciences du Numérique
Professor
66
Publications
9,129
Reads
390
Citations
Introduction
Nacéra Bennacer Seghouani is a professor in the Computer Science Department of CentraleSupélec and a research member of LRI (Laboratoire de Recherche en Informatique). Her research currently focuses on information extraction (multilingual text, unreconciled entities, structured data as graphs, ...), learning users' profiles in social networks from heterogeneous data, and large network analysis.
Publications (66)
Applying transfer learning based on pre-trained language models has become popular in Natural Language Processing. In this paper, we present a weakly supervised Named Entity Recognition system that uses a pre-trained BERT model and applies two consecutive fine-tuning steps. We aim to reduce the amount of human labour required for annotating data by...
In recent years there has been a growing interest in analyzing human behavioral data generated by new technologies. One type of digital footprint that is universal across the world, but that has received relatively little attention to date, is spending behavior.
In this paper, using the transaction records of 1306 bank customers, we investigated th...
Twitter is a social network that offers a rich and interesting source of information that is challenging to retrieve and analyze. Twitter data can be accessed using a REST API. The available operations allow retrieving tweets on the basis of a set of keywords, but with limitations such as the number of calls per minute and the size of results. Besides, ther...
The definition of effective strategies for graph partitioning is a major challenge in distributed environments, since an effective graph partitioning can considerably improve the performance of large graph data analytics computations. In this paper, we propose a multi-objective and scalable Balanced GRAph Partitioning (B-GRAP) algorithm, based...
The definition of effective strategies for graph partitioning is a major challenge in distributed environments, since an effective graph partitioning can considerably improve the performance of large graph data analytics computations. In this paper, we propose a multi-objective and scalable Balanced GRAph Partitioning (B-GRAP) algorithm to pro...
Schema matching is a critical problem in many applications where the main goal is to match attributes coming from heterogeneous sources. In this paper, we propose PROCLAIM (PROfile-based Cluster-Labeling for AttrIbute Matching), an automatic, unsupervised clustering-based approach to match attributes of a large number of heterogeneous sources. We d...
Since today’s online social media serve diverse purposes such as social and professional networking, photo and blog sharing, it is not uncommon for people to have multiple profiles across different social networks. Finding or reconciling these profiles would allow the creation of a holistic view of different facets of a person’s life that can be us...
Although social media platforms serve diverse purposes, from social and professional networking to photo sharing and blogging, people frequently use them to share their thoughts, their opinions and, most importantly, their interests (e.g., politics, economy, sports). Understanding the interests of social media users is key to many applications that need...
In this work, we propose HerM (Heterogeneous Distributed Model), a NoSQL data modeling approach which supports the use of multiple heterogeneous NoSQL systems in a distributed environment. We define the conceptual elements necessary for data modeling, and we identify optimized data distribution patterns. We implemented a flexible framework, where w...
Twitter is a social network that provides a powerful source of data. Analyzing these data poses many challenges, notable among which is the opportunity to determine the reputation of a product, a person or any other entity of interest. Several approaches for sentiment analysis have been proposed in the literature to assess the general opinion expresse...
Social Networking Sites, such as Twitter and LinkedIn, are clear examples of the impact that the Web 2.0 has on people around the world, because they target an aspect of life that is extremely important to anyone: social relationships. The key to building a social network is the ability of finding people that we know in real life, which, in turn,...
Several studies have shown that the users of Twitter reveal their interests (i.e., what they like) while they share their opinions, preferences and personal stories.
Twitter is a social network that provides a powerful source of data. Analyzing these data poses many challenges, notable among which is the opportunity to determine the reputation of a product, of a person, or of any other entity of interest. Several tools for sentiment analysis have been built in order to calculate the general opinion of an entit...
Many Wikipedia articles that cover the same topic in different language editions are interconnected via cross-language links that enable the understanding of topics in multiple languages, as well as cross-language information retrieval applications. However, cross-language links are added manually by the users of Wikipedia and, as such, are often i...
In online social networks individuals are given the option to reveal on their online profiles some personal information about themselves including, among others, their home location that, if specified, is typically referred to with a toponym. A toponym disclosed by an individual on her profile, or self-reported toponym (e.g., "London"), is often am...
Wikipedia is a well-known public and collaborative encyclopaedia consisting of millions of articles. Initially in English, the popular website has grown to include versions in over 288 languages. These versions and their articles are interconnected via cross-language links, which not only facilitate navigation and understanding of concepts in multi...
This paper proposes an Ontology-driven and Community-based Web Services (OCWS) framework which aims at automating discovery, composition and execution of web services. The purpose is to validate and to execute a user’s request built from the composition of a set of OCWS descriptions and a set of user constraints. The defined framework separates cle...
Social Networking Sites, such as Facebook and LinkedIn, are clear examples of the impact that the Web 2.0 has on people around the world, because they target an aspect of life that is extremely important to anyone: social relationships. The key to building a social network is the ability of finding people that we know in real life, which, in turn,...
With the advent of Web 2.0, social networking services that place the user at the center of their concerns have proliferated. These services make it possible to share resources (YouTube, Flickr, Del.icio.us), to exchange information and to build personal or professional relationships (Facebook, LinkedIn) or...
The Linked Open Data initiative has led to more and more RDF data sources being published on the Web. However, these data sources contain relatively little information compared to the documents available on the surface Web. Many annotation tools have been proposed in the last decade for the automatic construction and enrichment of knowledge bases. But...
Abstract: Thanks to Linked Open Data, more and more RDF sources are being made available on the web. However, although these sources are large and evolving, they contain relatively little information compared with the volume of information contained in semi-structured documents. Many tools aim...
In the past few years, recommender systems and semantic web technologies have become main subjects of interest in the research community. In this paper, we present a domain independent semantic similarity measure that can be used in the recommendation process. This semantic similarity is based on the relations between the individuals of an ontology...
The constant growth of the Internet has made recommender systems very useful to guide users coping with a large amount of data. In this paper, we present a domain independent collaborative and semantic-based recommender system which uses distinct and complementary modules. The approach targets users with various interests and is based on: (i) a col...
This paper presents SHIRI-Querying, an approach for semantic search in semistructured documents. We propose a solution to tackle incompleteness and imprecision of annotations at querying time. This solution relies on two elementary reformulation types that use the notion of aggregation and the documents structure. We present the dynamic algorithm (...
This paper presents SHIRI-Querying, an approach for semantic information retrieval in semi-structured documents. We propose a solution to overcome the incompleteness and imprecision of annotations at query time. This solution relies on two types of elementary reformulations that exploit the notion of aggrega...
SHIRI is an ontology-based system for the integration of semi-structured documents related to a specific domain. The system's purpose is to allow users to access relevant parts of documents as answers to their queries. SHIRI uses RDF/OWL for the representation of resources and SPARQL for querying them. It relies on an automatic, unsupervised and ontol...
In this paper, we present a composition model for semantic web services based on Statecharts. The environment is based on a distributed architecture and on semantic, uniform, ontology-based community service descriptions. Thus, the community constitutes a unified, homogeneous and consistent access interface for existing heterogeneous e-...
In this work, we present a framework for the semantic composition of web services based on Statecharts and uniform community service descriptions. Our model is a two-step process. In the first step, we derive the execution model of the user's query. The execution model is specified in the Statecharts formalism, whereas the user's query is described in...
In this paper, we present SHIRI-Annot, an automatic, ontology-driven and unsupervised approach for the semantic annotation of documents which contain more or less structured parts. The aim of this approach is to build an integration system called SHIRI which allows the user to access documents related to a specific domain. In this system, the...
The increasing volume of data available on the Web makes information retrieval a tedious and difficult task. The vision of the Semantic Web introduces the next generation of the Web by establishing a layer of machine-understandable data, e.g., for software agents, sophisticated search engines and Web services. The success of the Semantic Web crucia...
In this paper, we present a semantic web services composition framework based on a distributed architecture and on semantic and uniform ontology-based service descriptions. Central to our work is the use of the concept of community of services. A community constitutes a unified, homogeneous and consistent access interface to existing and heterogeneo...
Relation extraction is a difficult open research problem with important applications in several fields such as knowledge management, web mining, ontology building, intelligent systems, etc. In our research, we focus on extracting relations among the ontological concepts in order to build a domain ontology. In this paper, firstly, we answer some cru...
In this paper, we focus on the ontological concept extraction and evaluation process from HTML documents. In order to improve this process, we propose an unsupervised hierarchical clustering algorithm, namely "Contextual Concept Discovery" (CCD), which is an incremental use of the partitioning algorithm K-means and is guided by a structural context. O...
In this paper, we present a semantic web services composition framework based on a distributed architecture and on semantic and uniform ontology-based service descriptions. Central to our framework is the use of the concept of community of services. This concept provides means to describe services with the same language based on the community ontol...
Abstract. We present an automatic approach for the semantic enrichment of HTML documents that exploits a description of the domain, more precisely a set of concepts, their properties, their relations, the associated cardinalities and their contextual characteristics, to semantically enrich the content of the documents. The process...
Ontologies provide a common layer which plays a major role in supporting information exchange and sharing. In this paper, we focus on the ontological concept extraction process from HTML documents. In order to improve this process, we propose an unsupervised hierarchical clustering algorithm, namely "contextual ontological concept extraction" (COCE)...
Ontologies provide a common layer that plays a major role in supporting information exchange and sharing. Their proliferation relies strongly on the automation of their building, integration and deployment processes. In this paper, we present an integrated framework involving complementary dimensions to drive the (semi) automatic acquisition conc...
Ontologies provide a common layer which plays a major role in supporting information exchange and sharing. In this paper, we focus on the ontological concept extraction process from HTML documents. We propose an unsupervised hierarchical clustering algorithm namely “Contextual Ontological Concept Extraction” (COCE) which is an incremental use of a...
Ontology mappings provide a common layer which allows distributed applications to share and to exchange semantic information. Providing mechanized ways for mapping ontologies is a challenging issue and main problems to be faced are related to structural and semantic heterogeneity. The complexity of these problems increases in the presence of spatio...
The increasing volume of data available on the Web makes information retrieval a tedious and difficult task. The vision of the Semantic Web introduces the next generation of the Web by establishing a layer of machine-understandable data, e.g., for software agents, sophisticated search engines and Web services. The success of the Semantic Web cruciall...
Ontologies provide a common layer which plays a major role in supporting information exchange and sharing. Their proliferation relies strongly on the automation of ontology building, integration and deployment processes. In this paper we introduce an integrated framework involving different and complementary dimensions to drive the (semi) autom...
The interoperability problem arises in heterogeneous systems where different data sources coexist and there is a need for meaningful information sharing. One of the most representative realms of diversity of data representation is the spatio-temporal domain. Spatio-temporal data are most often described according to multiple and greatly diverse perce...
The World-Wide Web hosts many autonomous and heterogeneous information sources. In the near future each source may be described by its own ontology. The distributed nature of ontology development will lead to a large number of local ontologies covering overlapping domains. Ontology integration will then become an essential capability for effective...
This paper deals with the automation of the ontology building process from HTML pages. Our methodology is based on the complementary use of two approaches. The first approach is based on the Aussenac-Gilles methodology and requires feedback from the user to propose and refine concepts and their relationships. The second one exploits Web pages structure a...
In this paper we present an example of a part of a pedagogical ontology for a grande école. OWL is intended to help users to formalize ontologies and to be a support for the Semantic Web, for example to enable the interchange of resources and the inference of knowledge while querying these resources.
The World Wide Web offers an increasing amount of complex and rich educational Web resources that are available for free in various domains. Unfortunately, it is difficult today to have a Web agent that answers precisely a simple query. Semantic Web aims to make Web resources meaningful to automated agents. Ontologies are proposed to provide a form...
Probabilistic validation is an approach for the validation of highly dependable and complex systems. It relies on a partial analysis of a system model and tries to prove that failed event occurrences have a sufficiently low probability. We define a probabilistic validation method using worst-event-driven and importance sampling simulation. Th...
The classical validation approach tries to prove that failed events (events that do not satisfy a user property) will never occur. Probabilistic validation relies on a partial analysis of a system model and tries to prove that failed event occurrences have a sufficiently low probability. An incorrect behavior is a very rare event: it is the conseq...