Capability Maturity Model Integration by Sally Godfrey (2008)  

Source publication
Article
Any public administration that produces translation data can be a provider of useful reusable data to meet its own translation needs and those of other public organizations and private companies that work with texts of the same domain. These data can also be crucial for producing domain-tuned Machine Translation systems. The organization's manageme...

Context in source publication

Context 1
... our translation data provider maturity model we have taken into account the capability maturity model integration by Sally Godfrey (2008), which is shown in Figure 1, and Aymerich and Carmelo's (2009) report on translation services at the Pan American Health Organization. ...

Citations

Chapter
Machine translation (MT) is special in that it heavily relies on data. In rule-based MT, an engine performs the translation task by using language resources such as dictionaries and grammar rules, usually written by experts, but sometimes learned from monolingual or bilingual text. Corpus-based (statistical and, more recently, neural) MT leverages large amounts of monolingual and sentence-aligned bilingual text. Clearly, MT programs using these data are works of creation that may be copyright-protected, but this chapter focuses on data. Human labour, and therefore, creative authorship of works, is present in all forms of MT data: monolingual text has been authored, parallel text has been translated and aligned, and rules and dictionaries have been written by experts. Since its conception centuries ago, copyright protects the livelihoods of authors by regulating how copies of these data can be used and how works derived from them are used and published, using instruments such as licences. While the case of dictionaries and grammars as used in rule-based MT is reasonably clear, as they are purposely written for one or another language-processing application, monolingual and parallel text, as used in MT, were not created with MT in mind, and this has led some authors to ask whether authors and translators should get additional compensation for this unintended use of their work to generate new value downstream. This chapter gives an overview of the different sources of data used in MT, discussing authorship along the steps of creating, curating and transforming those data for use with MT, determining the kinds of implicit and explicit licensing schemes that apply to them and how they work. It also describes the controversy surrounding the use of published works to generate new, initially unintended, value through translation technologies and the various ways in which copyright issues are addressed.
Chapter
Governments and organizations want to reap observed open data benefits like trust, participation, collaboration, transparency, anti-corruption, decreased bureaucracy, and improved organizational capacity and innovative practices. However, they face challenges during this transition since they need a holistic roadmap, including where to start and what to do to utilize the open data concept. To satisfy this need, we developed a theoretically grounded and methodologically rigorous process reference model for the open data domain to assess the current situation and provide a roadmap for improvements. The open data process reference model (OD-PRM), consisting of 23 open data-specific process definitions with a comprehensive perspective on the domain, was developed based on the ISO/IEC 330xx family of standards. With the OD-PRM, an organization's open data process capability and maturity levels can be assessed based on ISO/IEC 330xx to provide a current level assessment and a roadmap for improvement to implement, use, maintain, and publish open data in a standardized manner.