Tsvetana Dimitrova

Bulgarian Academy of Sciences | BAS · Institute for Bulgarian Language (IBL)

Ph.D.

About

36
Publications
9,656
Reads
81
Citations
Additional affiliations
March 2010 - present
Bulgarian Academy of Sciences
Position
  • Professor (Assistant)
June 2008 - February 2010
Bulgarian Academy of Sciences
Position
  • Research Assistant
Education
August 2003 - April 2008
August 1996 - August 2001
Sofia University "St. Kliment Ohridski"
Field of study
  • Bulgarian Language and Literature


Publications (36)
Article
The article offers an approach for tracking the knowledge and skills for the use of verbs that are deemed part of the basic vocabulary of students in the initial stage of education through language tasks. There are 5 types of tasks that will be conducted in the form of an online game in 4 variants. They are aimed at researching basic competencies i...
Article
Full-text available
The study aims at presenting the predicatives of state in linguistic research. The existing descriptions of the predicatives expressing state are analyzed in the context of the semantic typology of predicatives with a view to their structure and the scope of the semantic field to which they belong. Several classifications are considered that take into...
Article
Full-text available
The article traces the formation of the clitic cluster in Bulgarian from Old Church Slavonic through Middle Bulgarian to Early Modern Bulgarian and beyond. It offers a hypothetical two-layer structure of the cluster – with the main layer consisting of a (pronominal) core and a (verbal) periphery, and a secondary layer hosti...
Article
Full-text available
The article discusses the semantics and structure of the sentences introduced by the conjunction "макар" (although) drawing upon data excerpted from Bulgarian monuments dated between the 15th and 17th centuries. In this early period, "макар" was a newly introduced conjunction which later extended its use to become a widely used concessive conjuncti...
Conference Paper
Full-text available
We give a short survey of the clitic syntax and the placement of the general negation in Middle Bulgarian, with a focus on Wallachian Bulgarian letters (ca. 1386–1509 AD). These issues are interdependent. The placement of the negation marker не in the clause has an impact on the clitic-internal ordering. Auxiliary clitics tend to be placed differ...
Article
Full-text available
The semantic classification of adjectives in the Bulgarian Wordnet: Towards a multiclass approach. The paper presents an attempt at semantic classification of adjectives in the Bulgarian wordnet. Although designed for the Bulgarian wordnet, the classification can be applied to other wordnets which are developed in parallel to the Princeton WordNet....
Conference Paper
Full-text available
The paper presents an overview of an attempt at the semantic classification of adjectives in the Bulgarian Wordnet based on the information that is already available in WordNet, and other classifications proposed in the literature (classifications in the linguistic literature for Bulgarian and approaches implemented by other wordnets, more precisel...
Conference Paper
Full-text available
The paper presents some preliminary observations on the classification of the adjectives in WordNet for a discussion on the principles applied. The insights support a work-in-progress on the development and introduction of a more detailed classification of the adjectives in the (Bulgarian) WordNet for enriching it with further information about the...
Conference Paper
Full-text available
This paper presents the extraction, representation and management of metadata in the Bulgarian National Corpus. We briefly present the current state of the Corpus and the general principles on which its development rests: uniformity, diversity of text samples, automatic compilation, extensive metadata, multi-layered linguistic annotation. The releva...
Conference Paper
Full-text available
This paper presents a machine learning method for automatic identification and classification of morphosemantic relations (MSRs) between verb and noun synset pairs in the Bulgarian WordNet (BulNet). The core training data comprise 6,641 morphosemantically related verb–noun literal pairs from BulNet. The core data were preprocessed quality-wise by a...
Conference Paper
Full-text available
This paper presents a web interface for wordnets named Hydra for Web which is built on top of Hydra – an open source tool for wordnet development – by means of modern web technologies. It is a Single Page Application with a simple but powerful and convenient GUI. It has two modes for visualisation of the language correspondences of searched (and foun...
Conference Paper
Full-text available
This paper presents Hydra for Web – a web interface for wordnets (and lexical-semantic databases with similar relational structure). Hydra for Web is built on top of Hydra – an open source tool for wordnet development – and is a single page application with a simple GUI. It has two modes – single and parallel – for visualisation of the language cor...
Conference Paper
Full-text available
In the context of developing wordnets and using them in various applications, we have been enriching the Romanian and Bulgarian resources with morphosemantic relations that can help broaden the wordnet content and improve the possible NLP applications. In this paper, we build on our previous results, adding to our presentation data from English...
Conference Paper
Full-text available
This paper presents work in progress on a machine learning method for classification of morphosemantic relations between verb and noun synsets. The training data comprises 5,584 verb–noun synset pairs from the Bulgarian WordNet, where the morphosemantic relations were automatically transferred from the Princeton WordNet morphosemantic database. Th...
Article
Full-text available
In this article, we trace the diachronic phases of so-called genitive-dative syncretism in Old Bulgarian, a phenomenon which marks the beginning of the process of disintegration of the Case system in the history of Bulgarian. We base our research on a corpus study (comprising the texts of Codex Marianus, Codex Zographensis and Codex Suprasliensis)...
Conference Paper
Full-text available
This paper demonstrates how historical corpora can be used in researching language phenomena. We exemplify the advantages and disadvantages through exploring three of the available corpora that contain textual sources of Old and Middle Bulgarian language to shed light on some aspects of the development of two words of ambiguous class. We discuss th...
Article
Full-text available
The paper motivates a strategy for identification and annotation of derivational relations in the Bulgarian wordnet that aims at coping with the complex morphology of the language in an elegant way. Our method involves transfer of the Princeton WordNet (morpho)semantic relations into the Bulgarian wordnet, at the level of the synset, and further de...
Article
Full-text available
The paper discusses several key concepts related to the development of corpora and reconsiders them in light of recent developments in NLP. On the basis of an overview of present-day corpora, we conclude that the dominant practices of corpus design do not adequately utilise the technologies and, as a result, fail to meet the demands of corpus lingu...
Conference Paper
Full-text available
The paper presents the partially automatically annotated and fully manually validated Bulgarian-English Sentence- and Clause-Aligned Corpus. The discussion covers the motivation behind the corpus development, the structure and content of the corpus, illustrated by statistical data, the segmentation and alignment strategy and the tools used in the c...
Conference Paper
Full-text available
The paper presents a new resource-light, flexible method for clause alignment which combines the Gale-Church algorithm with internally collected textual information. The method does not resort to any pre-developed linguistic resources, which makes it very appropriate for resource-light clause alignment. We experiment with a combination of the method...
Book
"The Old Bulgarian Noun Phrase: Towards an Annotation Specification" addresses the issue of application of modern linguistic approaches to historical language data in a corpora-oriented approach. The study combines a linguistic analysis of Old Bulgarian diachronic data with a proposal for an annotation specification for the nominal categories. Ca....
