Classification results for each set of language representations, using naive cross-validation where languages related to the evaluated language are not excluded from the training fold. The point of this figure is to demonstrate how unsound evaluation methods give misleading results; see the main text for details.


Source publication
Preprint
To what extent can neural network models learn generalizations about language structure, and how do we find out what they have learned? We explore these questions by training neural models for a range of natural language processing tasks on a massively multilingual dataset of Bible translations in 1295 languages. The learned language representation...

Context in source publication

Context 1
... To illustrate the effect of not following our cross-validation setup (Section 7.2), we now compare Figure 8a ... The NMT-based representations (NMTeng2x and NMTx2eng) perform equally poorly in both cases, suggesting that they do not correlate well with any type of language similarity. For representations such as Lexical and ASJP, the naive cross-validation setup results in a much higher classification F1 than the linguistically sound cross-validation. ...
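The evaluation pitfall described above can be reproduced on synthetic data: when languages from the same family appear in both the training and test folds, a classifier can score well simply by memorizing near-duplicate representations of related languages. A minimal sketch, assuming hypothetical family labels and synthetic "representations" (none of this is the paper's actual data or code):

```python
# Sketch: naive cross-validation vs. family-grouped cross-validation.
# All data is synthetic; family labels stand in for genealogical groups.
import random

random.seed(0)

N_FAMILIES, LANGS_PER_FAMILY, DIM = 5, 20, 8

# Each family gets a latent center; its languages are small perturbations,
# so related languages have near-duplicate representations.
centers = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_FAMILIES)]
X, y = [], []
for fam, center in enumerate(centers):
    for _ in range(LANGS_PER_FAMILY):
        X.append([v + random.gauss(0, 0.1) for v in center])
        y.append(fam)  # task: predict the language family

def nn_accuracy(train_idx, test_idx):
    """1-nearest-neighbour classification accuracy on the test fold."""
    correct = 0
    for i in test_idx:
        nearest = min(train_idx,
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(X[i], X[j])))
        correct += (y[nearest] == y[i])
    return correct / len(test_idx)

idx = list(range(len(X)))

# Naive CV: random folds, so each test language has same-family
# neighbours in the training fold -> near-perfect accuracy.
random.shuffle(idx)
folds = [idx[k::5] for k in range(5)]
naive = sum(nn_accuracy([i for f in folds if f is not fold for i in f], fold)
            for fold in folds) / 5

# Sound CV: hold out an entire family, so its label never occurs in
# training and accuracy collapses to zero.
fam_folds = [[i for i in range(len(X)) if y[i] == fam]
             for fam in range(N_FAMILIES)]
sound = sum(nn_accuracy([i for f in fam_folds if f is not fold for i in f],
                        fold)
            for fold in fam_folds) / 5

print(f"naive CV accuracy: {naive:.2f}")   # near 1.0: leakage inflates it
print(f"sound CV accuracy: {sound:.2f}")   # 0.00: family unseen in training
```

Because the held-out family's label never appears in the training fold, the grouped setup cannot be gamed by similarity to relatives, which is exactly why the naive scores for Lexical and ASJP are misleadingly high.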