Chapter · PDF available

Abstract

Advanced translation workbenches with detailed logging and eye-tracking capabilities greatly facilitate the recording of keystrokes, mouse activity, and eye movements of translators and post-editors. The large-scale analysis of the resulting data logs, however, is still an open problem. In this chapter, we present and evaluate a statistical method to segment raw keylogging and eye-tracking data into distinct Human Translation Processes (HTPs), i.e., phases of specific human translation behavior, such as orientation, revision, or pause. We evaluate the performance of this automatic method against manual annotation by human experts with a background in Translation Process Research.
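The segmentation described in the abstract can be sketched, in spirit, as Viterbi decoding over a hidden Markov model whose hidden states are HTPs (orientation, revision, pause) and whose observations are discretized logging events. The states, observation alphabet, and all probabilities below are invented for illustration; they are not the chapter's fitted parameters.

```python
# Illustrative sketch: segmenting a keylog/gaze event stream into HTP
# phases with Viterbi decoding over a small HMM. All parameters are
# hypothetical, chosen only to make the example run.
import math

states = ["orientation", "revision", "pause"]

start = {"orientation": 0.6, "revision": 0.2, "pause": 0.2}
trans = {
    "orientation": {"orientation": 0.7, "revision": 0.2, "pause": 0.1},
    "revision":    {"orientation": 0.1, "revision": 0.7, "pause": 0.2},
    "pause":       {"orientation": 0.3, "revision": 0.2, "pause": 0.5},
}
emit = {
    "orientation": {"fixation": 0.7, "keystroke": 0.1, "idle": 0.2},
    "revision":    {"fixation": 0.2, "keystroke": 0.7, "idle": 0.1},
    "pause":       {"fixation": 0.1, "keystroke": 0.1, "idle": 0.8},
}

def viterbi(observations):
    """Return the most probable HTP state sequence for the observations."""
    V = [{s: math.log(start[s]) + math.log(emit[s][observations[0]])
          for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        V.append({})
        back.append({})
        for s in states:
            best_prev, best_score = max(
                ((p, V[t - 1][p] + math.log(trans[p][s])) for p in states),
                key=lambda x: x[1])
            V[t][s] = best_score + math.log(emit[s][observations[t]])
            back[t][s] = best_prev
    # Trace back the best path from the final state.
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        last = back[t][last]
        path.append(last)
    return list(reversed(path))

events = ["fixation", "fixation", "keystroke", "keystroke",
          "idle", "idle", "idle"]
print(viterbi(events))
```

With these toy parameters, the decoder labels the fixation run as orientation, the keystroke run as revision, and the idle run as pause, which is the kind of phase structure the chapter's method aims to recover from real user activity data.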
... Similarly, Läubli (2014) conceptualizes human translation activities as hidden Markov processes. In order to train those models, the UAD is automatically segmented into vectors of observations. ...
Book
Full-text available
This edited volume covers an array of the most relevant topics in translation cognition, taking different approaches and using different research tools. It explores theoretical and methodological issues using case studies and examining their practical and pedagogical implications. It is a valuable resource for translation studies scholars, graduate students and those interested in translation and translation training, enabling them to conceptualize translation cognition, in order to enhance their research methods and designs, manage innovations in their translation training or simply understand their own translation behaviours.
Chapter
Full-text available
Translation process research (TPR) has advanced in recent years to a state which allows us to study “in great detail what source and target text units are being processed, at a given point in time, to investigate what steps are involved in this process, what segments are read and aligned and how this whole process is monitored” (Alves 2015, p. 32). We have sophisticated statistical methods and powerful tools with which to produce a better and more detailed understanding of the underlying cognitive processes involved in translation. Following Jakobsen (2011), who suspects that we may soon be in a situation which allows us to develop a computational model of human translation, Alves (2015) calls for a “clearer affiliation between TPR studies and a particular cognitive sciences paradigm” (p. 23).
... Once predictions can be made and validated or rejected, existing models can be modified and/or extended. A promising start on such research questions has been presented, for instance, by Läubli and Germann (2016). ...
Chapter
Full-text available
This chapter tracks the development of translation models from the late 1940s to the beginning of this century. Models of the translation process have been developed within computational linguistics (CL) for machine translation (MT) and within translation studies (TS) to model the process of human translation (HT). It points out the similarities and differences between these translation models, and suggests future avenues for the development of computational models of the translation process. The chapter discusses a number of translation models that incorporate elements from a wide range of theories from cognitive, communication, and social sciences. Those models constitute a large inventory of translation-relevant qualities, but are not yet ready for rigorous quantitative assessment. The chapter also briefly traces the history of empirical investigations into the translation process, starting from think-aloud protocols (TAPs) in the 1990s, via corpus analyses, to more recent techniques of eye tracking and keylogging.
Article
The translation process can be studied as sequences of activity units. The application of machine learning technology offers researchers new possibilities in the study of the translation process. This research project developed a program, an activity unit predictor, using a hidden Markov model. The program takes duration, translation phase, target language, and fixation as input and produces an activity unit type as output. The highest prediction accuracy reached is 61%. As one of the first endeavors of its kind, the program demonstrates the strong potential of applying machine learning in translation process research.
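Training such a predictor on labeled activity units amounts, in the supervised case, to relative-frequency estimation of the HMM's start, transition, and emission probabilities. The sketch below uses invented state and observation labels; the article's actual feature set (duration, phase, target language, fixation) is richer than this single-symbol simplification.

```python
# Sketch of supervised HMM parameter estimation by counting, in the
# spirit of the activity-unit predictor described above. State labels
# (ORI/DRA/PAU) and observation symbols are hypothetical.
from collections import Counter, defaultdict

def train_hmm(sequences):
    """Estimate start, transition, and emission probabilities from
    sequences of (observation, state) pairs."""
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for seq in sequences:
        labels = [s for _, s in seq]
        start[labels[0]] += 1
        for obs, s in seq:
            emit[s][obs] += 1
        for a, b in zip(labels, labels[1:]):
            trans[a][b] += 1
    norm = lambda c: {k: v / sum(c.values()) for k, v in c.items()}
    return (norm(start),
            {s: norm(c) for s, c in trans.items()},
            {s: norm(c) for s, c in emit.items()})

# Two toy annotated sessions: fixations while orienting, keystrokes
# while drafting, idle time as a pause unit.
data = [[("fix", "ORI"), ("key", "DRA"), ("key", "DRA"), ("idle", "PAU")],
        [("fix", "ORI"), ("fix", "ORI"), ("key", "DRA")]]
start_p, trans_p, emit_p = train_hmm(data)
```

Once estimated this way, the probabilities can be plugged into a standard Viterbi decoder to predict activity unit types for unlabeled sessions; the 61% accuracy reported above suggests how hard the task remains even with such a model.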
Article
Full-text available
The use of neural machine translation (NMT) in a professional scenario implies a number of challenges despite growing evidence that, in language combinations such as English to Spanish, NMT output quality has already outperformed statistical machine translation in terms of automatic metric scores. This article presents the result of an empirical test that aims to shed light on the differences between NMT post-editing and translation with the aid of a translation memory (TM). The results show that NMT post-editing involves less editing than TM segments, but this editing appears to take more time, with the consequence that NMT post-editing does not seem to improve productivity as may have been expected. This might be due to the fact that NMT segments show a higher variability in terms of quality and time invested in post-editing than TM segments that are ‘more similar’ on average. Finally, results show that translators who perceive that NMT boosts their productivity actually performed faster than those who perceive that NMT slows them down.
Chapter
Full-text available
The three measurements for post-editing effort as proposed by Krings (2001) have been adopted by many researchers in subsequent studies and publications. These measurements comprise temporal effort (the speed or productivity rate of post-editing, often measured in words per second at the segment level), technical effort (the number of actual edits performed by the post-editor, sometimes approximated using the Translation Edit Rate metric (Snover et al. 2006), again usually at the segment level), and cognitive effort. Cognitive effort has been measured using think-aloud protocols, pause measurement, and, increasingly, eye-tracking. This chapter provides a review of studies of post-editing effort using eye-tracking, noting the influence of publications by Danks et al. (1997) and O'Brien (2006, 2008), before describing a single study in detail. The detailed study examines whether predicted effort indicators affect post-editing effort; the results were previously published in Moorkens et al. (2015). This chapter focuses instead on methodology and the logistics of running an eye-tracking study recording over 70 sessions. Most of the eye-tracking data analysed were unused in the previous publication, and the small amount presented was not described in detail due to space constraints. In this study, average fixation count per segment correlates very strongly with temporal effort, and average fixation duration correlates strongly with technical effort, a result that we compare with other studies of post-editing effort.
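Technical effort, as described above, is often approximated with an edit-distance metric. A minimal word-level proxy can be sketched as follows; note that full TER (Snover et al. 2006) also allows block shifts, which this simplification omits, so it is only an upper bound on the true edit rate.

```python
# Minimal proxy for technical post-editing effort: word-level edit
# distance between the MT output and its post-edited version,
# normalized by the post-edited length. Not full TER: block shifts
# are omitted for simplicity.
def edit_distance(a, b):
    """Word-level Levenshtein distance between token lists a and b."""
    d = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, wb in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1,            # deletion
                                   d[j - 1] + 1,        # insertion
                                   prev + (wa != wb))   # substitution
    return d[-1]

def technical_effort(mt_output, post_edited):
    """Edits per post-edited word, at the segment level."""
    mt, pe = mt_output.split(), post_edited.split()
    return edit_distance(mt, pe) / len(pe)

print(technical_effort("the the house is blue", "the house is blue"))
```

Here one deletion over a four-word post-edited segment yields a technical effort of 0.25, the kind of segment-level value that the study correlates with fixation measures.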
Article
Full-text available
Machine-translated segments are increasingly included as fuzzy matches within the translation-memory systems in the localisation workflow. This study presents preliminary results on the correlation between these two types of segments in terms of productivity and final quality. In order to test these variables, we set up an experiment with a group of eight professional translators using an on-line post-editing tool and a statistical machine translation engine. The translators were asked to translate new, machine-translated and translation-memory segments from the 80-90 percent value range using a post-editing tool without actually knowing the origin of each segment, and to complete a questionnaire. The findings suggest that translators have higher productivity and quality when using machine-translated output than when processing fuzzy matches from translation memories. Furthermore, translators' technical experience seems to have an impact on productivity but not on quality.
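The 80-90 percent band mentioned above refers to translation-memory fuzzy-match scores. Commercial TM tools each use their own proprietary scoring, but a common textbook formulation is character-level Levenshtein similarity, which the following hedged sketch illustrates.

```python
# Illustrative fuzzy-match score: character-level Levenshtein
# similarity between a new source segment and a TM source segment.
# Real TM tools apply proprietary variants (token weighting, tag and
# number handling), so this is only a rough approximation of the
# 80-90 percent band discussed above.
def levenshtein(a, b):
    """Character-level Levenshtein distance between strings a and b."""
    d = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, cb in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1,
                                   d[j - 1] + 1,
                                   prev + (ca != cb))
    return d[-1]

def fuzzy_score(new_segment, tm_segment):
    """Similarity in percent: 100 means an exact match."""
    dist = levenshtein(new_segment, tm_segment)
    return 100 * (1 - dist / max(len(new_segment), len(tm_segment)))

score = fuzzy_score("Click the Save button.", "Click the Save icon.")
print(round(score))
```

A pair differing only in one word, as in this example, lands near the low end of the 80-90 percent range, i.e., exactly the kind of match the experiment's participants were asked to process.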
Thesis
Full-text available
This study presents empirical research on no-match, machine-translated and translation-memory segments, analyzed in terms of translators’ productivity, final quality and prior professional experience. The findings suggest that translators have higher productivity and quality when using machine-translated output than when translating on their own, and that the productivity and quality gained with machine translation are not significantly different from the values obtained when processing fuzzy matches from a translation memory in the 85-94 percent range. The translators’ prior experience affects the quality they deliver but not their productivity. These quantitative findings are triangulated with qualitative data from an online questionnaire and from one-to-one debriefings with the translators.
Article
Whenever the quality provided by a machine translation system is not sufficient, a human expert is required to correct the sentences it produces. In this setting, the human translator generates bilingual data each time a translation is marked as correct, and expects the system to learn from the errors made. In this paper, we analyse the appropriateness of discriminative ridge regression for adapting the scaling factors of a state-of-the-art machine translation system within a conventional post-editing scenario and also within an interactive machine translation setup. Results show that the strategies applied in the former setup cannot be directly applied in the latter framework. Hence, discriminative ridge regression is revised and adapted for the interactive machine translation framework, with encouraging results.
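At its core, the adaptation of log-linear scaling factors from post-editing feedback can be sketched as closed-form ridge regression, w = (XᵀX + λI)⁻¹Xᵀy. The feature vectors (per-hypothesis model component scores) and targets (a quality signal derived from the post-edit) below are invented; the paper's exact discriminative formulation is not reproduced here.

```python
# Hedged sketch of ridge regression for adapting SMT scaling factors
# from post-editing feedback. All data are hypothetical; only the
# closed-form ridge estimator itself is standard.

def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting
    (adequate for the small, well-conditioned systems arising here)."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ridge(X, y, lam=0.1):
    """Closed-form ridge: w = (X^T X + lam*I)^-1 X^T y."""
    n = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    Xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(n)]
    return solve(XtX, Xty)

# Hypothetical feedback: two model component scores per hypothesis and
# a quality target (e.g., negative edit distance to the post-edit).
X = [[1.0, 0.5], [0.2, 1.0], [0.8, 0.3], [0.4, 0.9]]
y = [0.9, 0.4, 0.8, 0.5]
weights = ridge(X, y)
```

In the interactive setup the paper describes, such an update would be recomputed as each prefix-validated translation arrives, which is precisely why the batch post-editing strategy cannot be carried over unchanged.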