Figure 1 - available from: Scientific Reports
Experimental designs: (A) Human Predictability was estimated from the online responses of several participants to a web cloze-task experiment. Each participant had to complete one of every 30 words, and the text was uncovered as they responded. (B) Eye movements were recorded from a separate set of participants who read three of the eight texts in the lab. The eye movement measures (Gaze Duration) were analyzed using Linear Mixed Models. (C) Computational algorithms were trained on a large corpus of texts from a domain similar to that of the tested short stories (A,B). Image sources: (B) R project (https://www.r-project.org/logo/, The R Foundation, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0/, no changes made).

Source publication
Article
Full-text available
When we read printed text, we are continuously predicting upcoming words to integrate information and guide future eye movements. Thus, the Predictability of a given word has become one of the most important variables when explaining human behaviour and information processing during reading. In parallel, the Natural Language Processing (NLP) field...

Contexts in source publication

Context 1
... was first estimated by humans' responses to a cloze-task. Approximately 2500 participants read 1-8 texts (mean = 1.92) and completed approximately 300 words out of 26366 unique words, where each participant completed one of every 30 words on an online platform (Fig. 1A). Correlations of (logit) cloze-Predictability with the repetition number, the (log) frequency in a corpus, and the (inverse of) word length (Fig. 2) showed the expected behaviour (i.e., the more frequent, the shorter, and the more repeated the words were inside the text, the more predictable they were) 19 ...
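As an illustration of this estimation, a minimal sketch in Python (toy data and hypothetical column names, not the authors' code) of how logit cloze-Predictability could be computed from cloze responses and correlated with word properties:

```python
# Minimal sketch (toy data, illustrative column names): compute logit
# cloze-Predictability per word and correlate it with word properties.
import numpy as np
import pandas as pd

# One row per (word occurrence, participant response); "correct" is 1 if the
# response matched the word in the text.
responses = pd.DataFrame({
    "word_id": [0, 0, 0, 1, 1, 1],
    "correct": [1, 0, 1, 0, 0, 1],
})

# Cloze probability = proportion of correct completions, smoothed away from 0/1
# so the logit transform stays finite.
cloze = responses.groupby("word_id")["correct"].agg(["sum", "count"])
p = (cloze["sum"] + 0.5) / (cloze["count"] + 1.0)
logit_pred = np.log(p / (1.0 - p))

# Illustrative per-word properties: log corpus frequency, inverse length,
# and number of previous repetitions inside the text.
words = pd.DataFrame({
    "log_freq":   [2.1, 0.7],
    "inv_length": [1 / 4, 1 / 9],
    "repetition": [3, 0],
}, index=[0, 1])

print(words.assign(logit_pred=logit_pred).corr()["logit_pred"])
```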
Context 2
... separate set of 36 participants performed an eye-tracking experiment in the lab. Each participant read three of the eight texts. Texts were assigned to participants pseudo-randomly (Fig. 1B). Finally, we trained different computational models drawn from the Natural Language Processing (NLP) field on a larger corpus. This corpus was also composed of short stories in Spanish. The original stories were not contained in the larger corpus (Fig. ...
Context 3
... read three of the eight texts. Texts were assigned to participants pseudo-randomly (Fig. 1B). Finally, we trained different computational models drawn from the Natural Language Processing (NLP) field on a larger corpus. This corpus was also composed of short stories in Spanish. The original stories were not contained in the larger corpus (Fig. ...
Context 4
... estimations of Predictability were evaluated one at a time by replacing the cloze-Predictability (M2.N to M4.N). We first explored the parameter space for the N-gram+cache predictabilities and decided to use N = 4, δ = 0.00015 and λ = 0.15 (see Supplementary Fig. S1). The resulting co-variable was included in the model (M2.N, N-gram+cache model), where it made a highly significant contribution (Fig. 3A, Supplementary Table S1). ...
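The exact form of the N-gram+cache combination is not given in this excerpt; the following hedged sketch only illustrates one plausible way δ (recency decay) and λ (mixing weight) could enter an interpolation between a base N-gram probability and a cache term:

```python
# Hedged sketch of an N-gram + cache Predictability: the base 4-gram probability
# is mixed with a "cache" term that rewards words already seen earlier in the same
# text, with an exponential decay delta over distance and a mixing weight lambda.
# The functional form is an assumption for illustration, not the paper's code.
import math

def cache_prob(word, history, delta):
    """Recency-weighted evidence that `word` already occurred in `history`."""
    weights = [math.exp(-delta * (len(history) - i))
               for i, w in enumerate(history) if w == word]
    total = sum(math.exp(-delta * (len(history) - i))
                for i in range(len(history))) or 1.0
    return sum(weights) / total

def ngram_cache_prob(p_ngram, word, history, delta=0.00015, lam=0.15):
    """Interpolate the base N-gram probability with the cache component."""
    return (1.0 - lam) * p_ngram + lam * cache_prob(word, history, delta)

# Example: a word repeated earlier in the story becomes more predictable.
history = "the fox jumped over the lazy dog and the fox".split()
print(ngram_cache_prob(0.02, "fox", history))
```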
Context 5
... a model that summed all these results was implemented, which used the N-gram+cache for the fixated word and CS-FT for N + 1 (M9.N + 1, Fig. 4A, Supplementary Table S2). This model resulted in an AIC close to that of the best of all the explored models, with only two co-variables added over the baseline model (Fig. 4C), and was significantly better than M0.N + 1 and M1.N + 1 (Fig. 4D). ...
Context 6
... that developed at the level of the integration of new information with information from the beginning of the text. We hypothesized that these differences, both in the estimation of Predictability and in the eye movements, were the reason for the negative relation between cloze-Predictability of the following word (N + 1) and GD on word N (Fig. 4, M1.N + 1). This negative relation was found previously only in Chinese sentence-reading 26 , but not in German or Spanish sentence-reading 10,25 ...
Context 7
... we analyzed computer estimations of word Predictability with four different algorithms: N-gram, LSA, Word2Vec, and FastText. A 4-gram model was used with the addition of the local word frequency (see Supplementary Fig. S1). LSA, Word2Vec, and FastText were studied using 300 dimensions and the average Cosine Similarity (CS) with the previous words (context) as a proxy for Predictability, testing different context sizes (see Supplementary Fig. S2). ...
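A minimal sketch of the cosine-similarity proxy described here, assuming a generic 300-dimensional embedding lookup (`get_vector` is a placeholder, not the paper's implementation) and averaging the similarities over a context window:

```python
# Sketch of the cosine-similarity proxy for Predictability: the embedding of the
# target word is compared with the embeddings of the preceding context words and
# the similarities are averaged. `get_vector` stands for any 300-dimensional
# lookup (LSA, Word2Vec or FastText); the exact aggregation used in the paper
# may differ.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def context_similarity(target, context, get_vector, window=10):
    """Average cosine similarity between the target word and its context window."""
    context = context[-window:]                      # keep the last `window` words
    sims = [cosine(get_vector(target), get_vector(w)) for w in context]
    return float(np.mean(sims)) if sims else np.nan

# Toy embedding lookup used only for illustration.
rng = np.random.default_rng(0)
vocab_vectors = {}
def get_vector(word, dim=300):
    return vocab_vectors.setdefault(word, rng.normal(size=dim))

print(context_similarity("perro", "el viejo marinero llamó a su".split(), get_vector))
```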
Context 8
... estimation of the impact of these algorithms was analyzed using Linear Mixed Models (LMMs) and the Gaze Duration as the dependent variable (Fig. 3, Supplementary Table S1). The results of each of these computer-based Predictabilities on the gaze models clearly showed that the one that best explained eye movements was the N-gram+cache, even though it generated a large decrease in the frequency effect, presumably because of the high correlation between these two variables. In comparison with ...
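A hedged sketch of this kind of LMM analysis using Python's statsmodels on synthetic stand-in data (the original analysis was run in R and likely used a richer random-effects structure):

```python
# Sketch of a gaze-duration mixed model: log gaze duration is modelled from word
# length, log frequency and a computational Predictability, with random intercepts
# per participant. All data below are synthetic stand-ins for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
data = pd.DataFrame({
    "subject": rng.integers(0, 36, n),
    "inv_length": 1 / rng.integers(2, 12, n),
    "log_freq": rng.normal(3, 1, n),
    "ngram_cache_pred": rng.uniform(0, 1, n),
})
data["log_gaze_duration"] = (5.5 - 0.3 * data["ngram_cache_pred"]
                             - 0.1 * data["log_freq"] + rng.normal(0, 0.2, n))

model = smf.mixedlm("log_gaze_duration ~ inv_length + log_freq + ngram_cache_pred",
                    data=data, groups=data["subject"])
result = model.fit()
print(result.summary())   # the t-value of ngram_cache_pred estimates its contribution
```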
Context 9
... N-gram probability for each word in the stories from the Buenos Aires Corpus was calculated using the SRILM package (http://www.speech.sri.com/projects/srilm/). The window used to determine the context (N) was optimized using the correlation with the cloze-Predictability (Supplementary Fig. S1A). The optimal value for N was 4, after which the curve showed a plateau, which indicated that long chains of words did not appear in our training corpus. ...
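A sketch of how the window size N could be selected by correlating per-word (log) N-gram probabilities with cloze-Predictability; the probabilities are assumed to have been exported from SRILM beforehand, and the toy arrays below only illustrate the procedure:

```python
# Sketch of the window-size selection: for each candidate N, correlate the per-word
# (log) N-gram probabilities with the (logit) cloze-Predictability, print the curve
# so the plateau can be inspected, and return the N with the highest correlation.
# The probabilities are assumed to come from SRILM; the arrays here are toy data.
import numpy as np
from scipy.stats import pearsonr

def best_window(logprobs_by_n, logit_cloze):
    """logprobs_by_n: dict mapping N -> array of per-word log probabilities."""
    corrs = {n: pearsonr(lp, logit_cloze)[0] for n, lp in logprobs_by_n.items()}
    for n, r in sorted(corrs.items()):
        print(f"N = {n}: r = {r:.3f}")
    return max(corrs, key=corrs.get)

rng = np.random.default_rng(1)
cloze = rng.normal(size=200)
logprobs = {n: cloze * (1 - 1 / n) + rng.normal(scale=0.5, size=200) for n in range(2, 6)}
print("chosen N:", best_window(logprobs, cloze))
```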
Context 10
... δ and λ parameters were optimized for the 4-gram probabilities. We performed a grid search over δ ∈ [0.00005, 0.0005] and λ ∈ [0.05, 0.6], measuring the t-value of the 4-gram+cache variable in the M2.N model (Supplementary Fig. S1B). We kept the values of δ and λ with the maximum absolute t-value (δ = 0.00015 and λ = 0.15). ...
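A sketch of the δ/λ grid search under the assumption that each candidate pair requires refitting the gaze model and reading off the t-value of the 4-gram+cache co-variable; `fit_m2n` is a hypothetical stand-in for that refit:

```python
# Sketch of the delta/lambda grid search: for each candidate pair, rebuild the
# 4-gram+cache co-variable, refit the gaze-duration model, and keep the pair with
# the largest absolute t-value for that co-variable.
import numpy as np

def grid_search(fit_m2n, deltas, lambdas):
    best, best_t = None, 0.0
    for delta in deltas:
        for lam in lambdas:
            t = fit_m2n(delta, lam)          # t-value of the 4-gram+cache predictor
            if abs(t) > abs(best_t):
                best, best_t = (delta, lam), t
    return best, best_t

deltas = np.linspace(0.00005, 0.0005, 10)
lambdas = np.linspace(0.05, 0.6, 12)

# Toy stand-in for the real refit, peaking (in absolute value) near the reported optimum.
def fit_m2n(delta, lam):
    return -8.0 * np.exp(-((delta - 0.00015) / 0.0002) ** 2 - ((lam - 0.15) / 0.3) ** 2)

print(grid_search(fit_m2n, deltas, lambdas))
```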

Citations

... Accordingly, the cost of processing a time-series of symbols is a function of how predictable the series is, given the context in which it appears (Levy, 2008). For example, predictability has been defined in the reading literature as the probability of knowing a word before reading it, and it has been used to understand the variation of gaze duration over words in eye tracking experiments (Bianchi et al., 2020;Kliegl et al., 2006;Rayner, 1998). ...
... To thoroughly address these questions, we measured how easily a trained TSOM can predict an input form, by showing the entire form to the map one letter at a time from '#' to '$'. The idea, borrowed from the literature on word/sentence reading (Bianchi et al., 2020;Kliegl et al., 2006;Rayner, 1998), is that the predictability of an input form correlates inversely with the serial processing cost of the form. Put simply, highly predictable words are easier to process than hardly predictable words. ...
Article
Full-text available
Over the last decades, several independent lines of research in morphology have questioned the hypothesis of a direct correspondence between sublexical units and their mental correlates. Word and paradigm models of morphology shifted the fundamental part-whole relation in an inflection system onto the relation between individual inflected word forms and inflectional paradigms. In turn, the use of artificial neural networks of densely interconnected parallel processing nodes for morphology learning marked a radical departure from a morpheme-based view of the mental lexicon. Lately, in computational models of Discriminative Learning, a network architecture has been combined with an uncertainty reducing mechanism that dispenses with the need for a one-to-one association between formal contrasts and meanings, leading to the dissolution of a discrete notion of the morpheme. The paper capitalises on these converging lines of development to offer a unifying information-theoretical, simulation-based analysis of the costs incurred in processing (ir)regularly inflected forms belonging to the verb systems of English, German, French, Spanish and Italian. Using Temporal Self-Organising Maps as a computational model of lexical storage and access, we show that a discriminative, recurrent neural network, based on Rescorla-Wagner’s equations, can replicate speakers’ exquisite sensitivity to widespread effects of word frequency, paradigm entropy and morphological (ir)regularity in lexical processing. The evidence suggests an explanatory hypothesis linking Word and paradigm morphology with principles of information theory and human perception of morphological structure. According to this hypothesis, the ways more or less regularly inflected words are structured in the mental lexicon are more related to a reduction in processing uncertainty and maximisation of predictive efficiency than to economy of storage.
... Whereas these results demonstrate the importance of predictability, they used only two or three words of context and did not demonstrate a simple method for generating predictabilities for any text. More recent models, including 4-grams and word embeddings like FastText, were used by [Bianchi et al., 2020], who showed that the 4-gram model in particular could explain a considerable share of the variance in fixation durations. Going a step further, [Umfurer et al., 2021] used an LSTM-based recurrent neural network model trained on the Spanish Wikipedia and fine-tuned it on a set of stories to explain eye movements. ...
... A more negative AIC indicates a better fit. Following an approach described in [Bianchi et al., 2020], we analyzed residuals of models including computational predictabilities with linear models that included cloze predictability as a single predictor in order to assess the degree to which computational predictabilities are redundant with cloze predictabilities. If the effect of cloze predictabilities on residuals is considerably smaller than its effect on the original data, we conclude that computational and cloze predictabilities are largely redundant. ...
... To assess to what degree computational predictabilities play a role similar to that of cloze predictabilities in influencing single fixation durations, we compared the cloze probability effect before and after taking computational predictabilities into account and examined how the t-values changed. Following [Bianchi et al., 2020], for models with cloze predictability data, we fitted linear models with current word cloze predictability as the single predictor to the residuals of the LMM including computational predictability. Table 4 shows a comparison of model coefficients and their associated t-values. ...
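A hedged sketch of this residual comparison on synthetic data, assuming correlated cloze and computational predictabilities and illustrative column names:

```python
# Sketch of the residual analysis: fit the mixed model with a computational
# predictability, then regress cloze predictability on its residuals; a much
# smaller t-value on the residuals than on the raw data indicates the two
# predictors are largely redundant. Data below are synthetic stand-ins.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
cloze = rng.uniform(0, 1, n)
comp = np.clip(cloze + rng.normal(0, 0.15, n), 0, 1)   # correlated with cloze
data = pd.DataFrame({
    "subject": rng.integers(0, 36, n),
    "cloze_pred": cloze,
    "comp_pred": comp,
})
data["log_gaze_duration"] = 5.5 - 0.4 * cloze + rng.normal(0, 0.2, n)

# LMM with the computational predictability, then cloze on its residuals.
lmm = smf.mixedlm("log_gaze_duration ~ comp_pred", data=data,
                  groups=data["subject"]).fit()
data["resid"] = lmm.resid

raw = smf.ols("log_gaze_duration ~ cloze_pred", data=data).fit()
res = smf.ols("resid ~ cloze_pred", data=data).fit()
print("t(cloze) on raw data:  ", round(raw.tvalues["cloze_pred"], 2))
print("t(cloze) on residuals: ", round(res.tvalues["cloze_pred"], 2))
```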
Conference Paper
Full-text available
A long tradition in eye movement research has focused on three linguistic variables explaining fixation durations during sentence reading: word length, frequency, and predictability. Lengths and frequencies are easily obtainable but predictabilities are tedious to collect, requiring the incremental cloze procedure. Modern large language models are trained using the objective of predicting the next word given previous context, hence they readily provide predictability information. This capability has largely been overlooked in eye movement research. Here we investigate the suitability of a synthetic predictability measure, extracted from pretrained GPT-2 models, as a surrogate for cloze predictability. Using several published eye movement corpora, we find that synthetic and cloze predictabilities are highly correlated, and that their influence on eye movements is qualitatively similar. Similar patterns are obtained when including synthetic predictabilities in data sets lacking cloze predictabilities. In conclusion, synthetic predictabilities can serve as a substitute for empirical cloze predictabilities.
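A minimal sketch of how such a synthetic predictability could be extracted from a pretrained GPT-2 with the Hugging Face transformers library; the checkpoint name and the single-sub-token simplification are assumptions for illustration:

```python
# Sketch of a "synthetic predictability": the probability a pretrained GPT-2
# assigns to a word given its preceding context. Multi-token words would need
# their sub-token probabilities multiplied; here only the first sub-token is used.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def synthetic_predictability(context, word):
    ids = tokenizer(context, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]          # distribution over the next token
    probs = torch.softmax(logits, dim=-1)
    word_id = tokenizer(" " + word).input_ids[0]   # first sub-token of the word
    return probs[word_id].item()

print(synthetic_predictability("The cat sat on the", "mat"))
```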
... In general, algorithms need to convert raw text inputs into numerical representations (word embeddings) through a process known as language modeling, which forms the basis for knowledge distillation [3]. Thus, several modelling approaches have been designed, such as Deep Learning-based models. ...
... In a recent study, Bianchi et al. (2020) contrasted the GD predictions of an n-gram model with the predictions of an LSA-based match of the current word with the preceding nine words during discourse reading. They found that LSA did not provide effects over and above the n-gram model. ...
... In sum, the literature on document-level semantics presently provides no consistent picture. Long-range semantic effects might be comparably small (e.g., Hofmann et al., 2017), but they may be more likely to deliver consistent results when the analysis is not constrained to the long-range contextual match of the present word, but also of other words (Bianchi et al., 2020). A more consistent picture might emerge when short-range predictability is also considered, as reflected, e.g., in n-gram models (Wang et al., 2010; Bianchi et al., 2020). ...
... Long-range semantic effects might be comparably small (e.g., Hofmann et al., 2017), but they may be more likely to deliver consistent results when the analysis is not constrained to the long-range contextual match of the present word, but also of other words (Bianchi et al., 2020). A more consistent picture might emerge when short-range predictability is also considered, as reflected, e.g., in n-gram models (Wang et al., 2010; Bianchi et al., 2020). ...
Article
Full-text available
Though there is a strong consensus that word length and frequency are the most important single-word features determining visual-orthographic access to the mental lexicon, there is less agreement as to how best to capture syntactic and semantic factors. The traditional approach in cognitive reading research assumes that word predictability from sentence context is best captured by cloze completion probability (CCP) derived from human performance data. We review recent research suggesting that probabilistic language models provide deeper explanations for syntactic and semantic effects than CCP. Then we compare CCP with three probabilistic language models for predicting word viewing times in an English and a German eye-tracking sample: (1) Symbolic n-gram models consolidate syntactic and semantic short-range relations by computing the probability of a word to occur, given two preceding words. (2) Topic models rely on subsymbolic representations to capture long-range semantic similarity by word co-occurrence counts in documents. (3) In recurrent neural networks (RNNs), the subsymbolic units are trained to predict the next word, given all preceding words in the sentences. To examine lexical retrieval, these models were used to predict single fixation durations and gaze durations to capture rapidly successful and standard lexical access, and total viewing time to capture late semantic integration. The linear item-level analyses showed greater correlations of all language models with all eye-movement measures than CCP. Then we examined non-linear relations between the different types of predictability and the reading times using generalized additive models. N-gram and RNN probabilities of the present word more consistently predicted reading performance compared with topic models or CCP. For the effects of last-word probability on current-word viewing times, we obtained the best results with n-gram models. Such count-based models seem to best capture short-range access that is still underway when the eyes move on to the subsequent word. The prediction-trained RNN models, in contrast, better predicted early preprocessing of the next word. In sum, our results demonstrate that the different language models account for differential cognitive processes during reading. We discuss these algorithmically concrete blueprints of lexical consolidation as theoretically deep explanations for human reading.
... With this information, the time spent by the reader's eyes on each word (i.e., Gaze Duration, GD) is analysed as a reflection of its processing cost. This variable is known to correlate with word properties such as word length, lexical frequency, position in the sentence or text, and Predictability, among others [15,14,3]. Nowadays, these analyses are performed using Linear Mixed Models (LMMs), which allow one to understand how all these word properties relate to GD while accounting for the variance introduced by the subjects or the material selected for the experiment (random effects). Thus, this type of analysis makes it possible to understand which text features our brains use to process information. ...
... Researchers have made several attempts to model it using simple computational models but, until now, have not reached conclusive results [19,9,3,10,1]. In 2008, Ong and Kliegl [19] analysed how the conditional co-occurrence probability (CCP) of a word given its context, measured by frequencies from internet search engines (Google, Yahoo!, MSN), could replace the cloze-Predictability in eye movement models. ...
... In 2020, Bianchi and colleagues [3] showed that N-gram probabilities and semantic similarities from different distributional-semantics algorithms (LSA, word2vec, FastText) can partially replace the cloze-Predictability in Linear Mixed Models (LMMs) with GD as the dependent variable. In that study, they analysed how much variance in GD was left for the cloze-Predictability to explain in the residuals of LMMs fitted with computational Predictabilities. ...
Conference Paper
Full-text available
Modern Natural Language Processing (NLP) models can achieve great results resolving different types of linguistic tasks. This is possible thanks to a high volume of internal parameters that are optimized during the training phase and allow them to model high-level linguistic properties. For example, LSTM-based language models have the ability to find long-term dependencies between words in a text and to use them to make predictions about upcoming words. Nevertheless, their complexity makes it hard to understand which features they use to generate predictions. The neurolinguistic field faces a similar issue when studying how our brain processes language. For example, every adult reader has the ability to understand long texts and to make predictions of upcoming words. Nevertheless, our understanding of how these predictions are driven is limited. During the last decades, the study of eye movements during reading has shed some light on this topic, finding a relation between the time spent on a word (gaze duration) and its processing cost. Here, we aim to understand how LSTM-based models predict future words and how these predictions relate to human predictions, fitting statistical models commonly used in the neurolinguistic field with gaze duration as the dependent variable. We found that an AWD-LSTM Language Model can partially model eye movements, with high overlap with both human-Predictability and lexical frequency. Interestingly, this last overlap is seen to depend on the training corpus, being lower when the model is fine-tuned with a corpus similar to the one used for testing.
... Eye-computer-based interaction provides an effective alternative to joystick-based control of mobility scooters for people who cannot functionally use their upper limbs [8]- [10]. It represents a novel approach for human-machine interaction and assisted living [11]- [15]. In addition, eye tracking provides an additional hands-free level of control for non-disabled people during fast, cognitively demanding tasks such as driving or flying. ...
Article
Full-text available
The tracking of eye gesture movements using wearable technologies can undoubtedly improve quality of life for people with mobility and physical impairments by using spintronic sensors based on the tunnel magnetoresistance (TMR) effect in a human–machine interface. Our design involves integrating three TMR sensors on an eyeglass frame for detecting relative movement between the sensor and tiny magnets embedded in an in-house fabricated contact lens. Using TMR sensors with a sensitivity of 11 mV/V/Oe and ten <1 mm³ magnets embedded within a lens, an eye gesture system was implemented with a sampling frequency of up to 28 Hz. Three discrete eye movements were successfully classified when a participant looked up, right or left using a threshold-based classifier. Moreover, our proof-of-concept real-time interaction system was tested on 13 participants, who played a simplified Tetris game using their eye movements. Our results show that all participants were successful in completing the game with an average accuracy of 90.8%.
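A hedged sketch of a threshold-based gesture classifier in the spirit of the one described; the thresholds and channel-to-gesture mapping are illustrative, not taken from the paper:

```python
# Sketch of a threshold-based eye-gesture classifier: each baseline-corrected TMR
# channel is compared against a calibrated threshold, and the first channel that
# exceeds its threshold decides the gesture. Values below are illustrative only.
THRESHOLDS = {"up": 0.8, "right": 0.6, "left": 0.6}   # assumed normalized signal units

def classify_gesture(sample):
    """sample: dict of baseline-corrected readings for the 'up', 'right', 'left' channels."""
    for gesture, threshold in THRESHOLDS.items():
        if abs(sample.get(gesture, 0.0)) > threshold:
            return gesture
    return "none"

print(classify_gesture({"up": 0.1, "right": 0.9, "left": 0.05}))  # -> "right"
```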
Article
Full-text available
Predictions of incoming words performed during reading have an impact on how the reader moves their eyes and on the electrical brain potentials. Eye tracking (ET) experiments show that less predictable words are fixated for longer periods of time. Electroencephalography (EEG) experiments show that these words elicit a more negative potential around 400 ms (N400) after the word onset when reading one word at a time (foveated reading). Nevertheless, there was no N400 potential during the foveated reading of previously known sentences (memory-encoded), which suggests that the prediction of words from memory-encoded sentences is based on different mechanisms than predictions performed on common sentences. Here, we performed an ET-EEG co-registration experiment where participants read common and memory-encoded sentences. Our results show that the N400 potential disappears when the reader recognises the sentence. Furthermore, time-frequency analyses show a larger alpha lateralisation and a beta power increase for memory-encoded sentences. This suggests a more distributed attention and an active maintenance of the cognitive set, in concordance with the predictive coding framework.
Chapter
Predictability corpora built via the Cloze task generally accompany eye-tracking data for the study of the processing costs of linguistic structures in tasks of reading for comprehension. Two semantic measures are commonly calculated to evaluate expectations about forthcoming words: (i) the semantic fit of the target word with the previous context of a sentence, and (ii) semantic similarity scores that represent the semantic similarity between the target word and Cloze task responses for it. For Brazilian Portuguese (BP), there was no large eye-tracking corpus with predictability norms. The goal of this paper is to present a method to calculate the two semantic measures used in the first BP corpus of eye movements during silent reading of short paragraphs by undergraduate students. The method was informed by a large evaluation of both static and contextualized word embeddings, trained on large corpora of texts. Here, we make publicly available: (i) a BP corpus for a sentence-completion task to evaluate semantic similarity, (ii) a new methodology to build this corpus based on the scores of Cloze data taken from our project, and (iii) a hybrid method to compute the two semantic measures in order to build predictability corpora in BP.
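A minimal sketch of these two semantic measures, assuming a generic embedding lookup (`get_vector` is a placeholder, not the authors' implementation):

```python
# Sketch of the two semantic measures described: (i) the semantic fit of the target
# word with the preceding sentence context, and (ii) the mean similarity between the
# target word and the Cloze-task responses collected for it.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_fit(target, context_words, get_vector):
    context_vector = np.mean([get_vector(w) for w in context_words], axis=0)
    return cosine(get_vector(target), context_vector)

def cloze_similarity(target, responses, get_vector):
    return float(np.mean([cosine(get_vector(target), get_vector(r)) for r in responses]))

# Toy embedding lookup for illustration only.
rng = np.random.default_rng(2)
vectors = {}
get_vector = lambda w: vectors.setdefault(w, rng.normal(size=300))

print(semantic_fit("bola", "o menino chutou a".split(), get_vector))
print(cloze_similarity("bola", ["bola", "pedra", "porta"], get_vector))
```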