Figure 2 - uploaded by Ciprian Chelba
A word-parse k-prefix. A complete parse (Figure 3) is a binary parse of the (<s>, SB) (w_1, t_1) ... (w_n, t_n) (</s>, SE) sequence with the following two restrictions:

Source publication
Article
Full-text available
A new language model for speech recognition inspired by linguistic analysis is presented. The model develops hidden hierarchical structure incrementally and uses it to extract meaningful information from the word history, thus enabling the use of extended-distance dependencies, in an attempt to complement the locality of currently used n-gram Mar...
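For concreteness, the model's central move can be written down. The display below is a paraphrase of the decomposition in the Chelba-Jelinek papers, not an excerpt from this abstract; the notation (W_k T_k for the word-parse k-prefix, h_0 and h_{-1} for its two most recently exposed headwords) is assumed from that work.

```latex
% Sketch of the structured language model's word prediction (paraphrased):
% the next word is conditioned on the two exposed headwords of the partial
% parse rather than on the two preceding words of the string,
P(w_{k+1} \mid W_k T_k) \approx P\big(w_{k+1} \mid h_0, h_{-1}\big)
% and the probability of a word string W sums over its complete parses T:
P(W) = \sum_{T} P(W, T)
```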

Similar publications

Article
Full-text available
GPT-3 is a large-scale natural language model developed by OpenAI that can perform many different tasks, including topic classification. Although researchers claim that it requires only a small number of in-context examples to learn a task, in practice GPT-3 requires these training examples to be either of exceptional quality or a higher quantity t...
Conference Paper
Full-text available
Models, modelling languages, modelling frameworks and their background have dominated conceptual modelling research and information systems engineering for the last four decades. Conceptual models are mediators between the application world and the implementation or system world. Design science distinguishes the relevance cycle as the iterative process...
Preprint
Full-text available
Working memory is a critical aspect of both human intelligence and artificial intelligence, serving as a workspace for the temporary storage and manipulation of information. In this paper, we systematically assess the working memory capacity of ChatGPT (gpt-3.5-turbo), a large language model developed by OpenAI, by examining its performance in verb...
Preprint
Full-text available
This study investigates the transformative impact of ChatGPT, a cutting-edge AI language model developed by OpenAI, on the financial sector. We explore ChatGPT's applications in financial analysis, focusing on its ability to perform tasks traditionally handled by human analysts. By creating a set of multi-step and advanced reasoning financial tasks...
Preprint
Full-text available
ChatGPT is a cutting-edge artificial intelligence language model developed by OpenAI, which has attracted a lot of attention due to its surprisingly strong ability to answer follow-up questions. In this report, we aim to evaluate ChatGPT on the Grammatical Error Correction (GEC) task and compare it with commercial GEC products (e.g., Grammarly)...

Citations

... Syntactic language modeling, to the best of our knowledge, can be dated back to Chelba (1997). Charniak (2001) and Clark (2001) propose to utilize a top-down parsing mechanism for language modeling. ...
Conference Paper
Full-text available
Variational auto-encoders (VAEs) are widely used in natural language generation due to the regularization of the latent space. However, generating sentences from the continuous latent space does not explicitly model the syntactic information. In this paper, we propose to generate sentences from disentangled syntactic and semantic spaces. Our proposed method explicitly models syntactic information in the VAE's latent space by using the linearized tree sequence, leading to better performance of language generation. Additionally, the advantage of sampling in the disentangled syntactic and semantic latent spaces enables us to perform novel applications, such as unsupervised paraphrase generation and syntax-transfer generation. Experimental results show that our proposed model achieves similar or better performance in various tasks, compared with state-of-the-art related work.
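To make the architecture concrete, here is a minimal illustrative sketch, not the authors' code: a PyTorch VAE skeleton with two latent vectors, z_syn (the slot the paper supervises with the linearized tree sequence) and z_sem, both fed to the decoder at every step. All class and variable names are hypothetical.

```python
import torch
import torch.nn as nn

class DisentangledVAE(nn.Module):
    """Illustrative sketch (not the authors' code): a sentence VAE whose
    latent space is split into a syntactic part and a semantic part."""

    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, lat_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Separate projections carve the encoder state into two latent spaces.
        self.syn_mu = nn.Linear(hid_dim, lat_dim)
        self.syn_logvar = nn.Linear(hid_dim, lat_dim)
        self.sem_mu = nn.Linear(hid_dim, lat_dim)
        self.sem_logvar = nn.Linear(hid_dim, lat_dim)
        self.decoder = nn.GRU(emb_dim + 2 * lat_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def reparameterize(self, mu, logvar):
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def forward(self, tokens):
        emb = self.embed(tokens)
        _, h = self.encoder(emb)
        h = h.squeeze(0)
        z_syn = self.reparameterize(self.syn_mu(h), self.syn_logvar(h))
        z_sem = self.reparameterize(self.sem_mu(h), self.sem_logvar(h))
        # The decoder sees both latents at every step; swapping z_syn between
        # sentences while keeping z_sem is what enables syntax transfer.
        z = torch.cat([z_syn, z_sem], dim=-1)
        z_steps = z.unsqueeze(1).expand(-1, tokens.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([emb, z_steps], dim=-1))
        return self.out(dec_out), z_syn, z_sem
```

Training would add the usual reconstruction and KL terms, plus the paper's syntactic supervision on z_syn via the linearized tree sequence.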
... Results presented in the paper show that a low-order Markov assumption leads to the same accuracy as parsing with no Markov assumption, but with a large efficiency gain in terms of computational complexity. In (Chelba and Jelinek, 2000), a structured LM based on syntactic analysis, capable of extracting meaningful information from the word history, is developed. In this model, syntactic analysis is applied during speech decoding for partial parsing (and tagging) of the recognition output. ...
Article
Full-text available
Speech is the most natural form of human communication, and convenient, efficient human–computer interaction requires the implementation of state-of-the-art spoken language technology. Research in this area has traditionally focused on a few major languages, such as English, French, Spanish, Chinese, and Japanese, while other languages, particularly Eastern European ones, have received much less attention. Recently, however, research activity on speech technologies for Czech, Polish, Serbo-Croatian, and Russian has been steadily increasing.
... He was able to reduce both sentence and word error rates on the ATIS corpus using this method. The structured language model (SLM) used in Chelba and Jelinek (1998a, 1998b, 1999), Jelinek and Chelba (1999), and Chelba (2000) is similar to that of Goddeau, except that (i) their shift-reduce parser follows a nondeterministic beam search, and (ii) each stack entry contains, in addition to the nonterminal node label, the headword of the constituent. The SLM is like a trigram, except that the conditioning words are taken from the tops of the stacks of candidate parses in the beam, rather than from the linear order of the string. ...
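The contrast between the two conditioning schemes is easy to sketch. The snippet below is illustrative only (hypothetical names, not the original implementation): a plain trigram conditions on the last two words of the string, while an SLM-style predictor conditions on the headwords of the two topmost entries of a candidate parse stack, which may lie far apart in the linear order.

```python
from collections import namedtuple

# One entry on a candidate parse stack: a nonterminal label plus the
# headword of the constituent, as in the SLM described above.
StackEntry = namedtuple("StackEntry", ["label", "headword"])

def trigram_context(words):
    """Plain trigram: condition on the two preceding words of the string."""
    return tuple(words[-2:])

def slm_context(stack):
    """SLM-style: condition on the headwords of the two topmost stack
    entries, which need not be adjacent in the word string."""
    return tuple(entry.headword for entry in stack[-2:])

# Toy partial parse (labels illustrative) for "the contract that the firm
# signed ...": when predicting the next word, the trigram sees only
# ("firm", "signed"), while the stack exposes the distant head "contract".
words = ["the", "contract", "that", "the", "firm", "signed"]
stack = [StackEntry("NP", "contract"), StackEntry("SBAR", "signed")]
print(trigram_context(words))  # ('firm', 'signed')
print(slm_context(stack))      # ('contract', 'signed')
```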
... [Perplexity table excerpt, partially recoverable: columns are Trigram Baseline / Model / Interpolation (λ = .36); Chelba and Jelinek (1998a): 167.14 ... 158.28; Chelba and Jelinek (1998b): 167...] ... count of the conditioning words in the training corpus, and maximum-likelihood mixing coefficients were calculated for each bin to mix the trigram with bigram and unigram estimates. Our trigram model performs at almost exactly the same level as theirs does, which is what we would expect. ...
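The interpolation referred to in the excerpt is ordinary linear mixing of the two models' next-word probabilities before perplexity is computed. A toy sketch follows; the numbers are made up, only the weight λ = .36 comes from the excerpt, and which of the two estimates λ weights is assumed here.

```python
import math

LAMBDA = 0.36  # weight from the excerpt; assumed here to weight the parser model

def interpolate(p_model, p_trigram, lam=LAMBDA):
    """Linearly mix two next-word probability estimates."""
    return lam * p_model + (1.0 - lam) * p_trigram

def perplexity(word_probs):
    """Perplexity is the exponential of the average negative log probability."""
    return math.exp(-sum(math.log(p) for p in word_probs) / len(word_probs))

# Made-up per-word probabilities from the two models over a three-word text.
p_model = [0.10, 0.05, 0.20]
p_trigram = [0.08, 0.04, 0.12]
mixed = [interpolate(m, t) for m, t in zip(p_model, p_trigram)]
print(perplexity(p_trigram), perplexity(mixed))  # mixing lowers perplexity here
```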
Article
This paper describes the functioning of a broad-coverage probabilistic top-down parser and its application to the problem of language modeling for speech recognition. The paper first introduces key notions in language modeling and probabilistic parsing, and briefly reviews some previous approaches to using syntactic structure for language modeling. A lexicalized probabilistic top-down parser is then presented, which performs very well, in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers. A new language model that utilizes probabilistic top-down parsing is then outlined, and empirical results show that it improves upon previous work in test-corpus perplexity. Interpolation with a trigram model yields an exceptional improvement relative to the improvement observed by other models, demonstrating the degree to which the information captured by our parsing model is orthogonal to that captured by a trigram model. A small recognition experiment also demonstrates the utility of the model.
... The exploitation of contextual knowledge is a key to successful OCR. In speech recognition, the importance of contextual and linguistic constraints has long been recognized [7,1]. Linguistic knowledge has also been applied successfully in machine-printed character recognition [6,5]. ...
Article
Full-text available
In this paper we present a new database for off-line handwriting recognition, together with a few preprocessing and text segmentation procedures. The database is based on the Lancaster-Oslo/Bergen (LOB) corpus, a collection of texts that were used to generate forms, which were subsequently filled out by hand. Up to now (December 1998) the database includes 556 forms produced by approximately 250 different writers. The database consists of full English sentences and can serve as a basis for a variety of handwriting recognition tasks. The main focus, however, is on recognition techniques that use linguistic knowledge beyond the lexicon level. This knowledge can be automatically derived from the corpus or supplied from external sources.
Keywords: handwriting recognition, database, unconstrained English sentences, corpus, linguistic knowledge
Conference Paper
Sentiment analysis on Chinese health forums is challenging because of the language, platform, and domain characteristics. Our research investigates the impact of three factors on sentiment analysis: sentiment polarity distribution, language models, and model settings. We manually labeled a large sample of Chinese health forum posts, which showed an extremely unbalanced distribution with a very small percentage of negative posts, and found that a balanced training set produced higher accuracy than the unbalanced one. We also found that hybrid approaches combining multiple language-model-based approaches to sentiment analysis performed better than individual approaches. Finally, we evaluated the effects of different model settings and improved the overall accuracy using the hybrid approaches in their optimal settings. Findings from this preliminary study provide deeper insights into the problem of sentiment analysis on Chinese health forums and will inform future sentiment analysis studies.
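As an illustrative sketch of the two reported findings, balancing the training set and then combining several models, the snippet below uses scikit-learn on made-up data; it is not the authors' pipeline, and the choice of classifiers is arbitrary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

def downsample_balance(X, y, seed=0):
    """Balance a binary training set by downsampling the majority class
    to the size of the minority class."""
    rng = np.random.default_rng(seed)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    small, big = sorted((pos, neg), key=len)
    keep = np.concatenate([small, rng.choice(big, size=len(small), replace=False)])
    return X[keep], y[keep]

def hybrid_predict(models, X):
    """Hybrid approach: average the class probabilities of several trained
    models rather than trusting any single one."""
    return np.mean([m.predict_proba(X) for m in models], axis=0).argmax(axis=1)

# Made-up bag-of-words counts standing in for labeled forum posts,
# with the minority class at roughly ten percent, as in the abstract.
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(200, 50)).astype(float)
y = (rng.random(200) < 0.1).astype(int)
Xb, yb = downsample_balance(X, y)
models = [LogisticRegression(max_iter=1000).fit(Xb, yb),
          MultinomialNB().fit(Xb, yb)]
print(hybrid_predict(models, X[:5]))
```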
Article
As previously introduced, the Structured Language Model (SLM) operated with the help of a stack from which less probable sub-parse entries were purged before further words were generated. In this article we generalize the CKY algorithm to obtain a chart that allows the direct computation of language model probabilities, thus rendering the stacks unnecessary. An analysis of the behavior of the SLM leads to a generalization of the Inside-Outside algorithm and thus to rigorous EM-type re-estimation of the SLM parameters. The derived algorithms are computationally expensive, but their demands can be mitigated by appropriate thresholding.
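For orientation, the chart the article builds on generalizes the standard inside (CKY-style) algorithm for a PCFG in Chomsky normal form. The sketch below implements only that standard algorithm, with a hypothetical toy grammar; the article's actual generalization, and the Inside-Outside-style EM re-estimation, go well beyond it.

```python
from collections import defaultdict

def inside_probability(words, lexical, binary, start="S"):
    """Standard inside algorithm for a CNF PCFG:
    chart[(i, j, A)] = total probability that A derives words[i:j]."""
    n = len(words)
    chart = defaultdict(float)
    for i, w in enumerate(words):  # lexical rules A -> w
        for lhs, prob in lexical.get(w, []):
            chart[(i, i + 1, lhs)] += prob
    for span in range(2, n + 1):  # fill wider spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for (lhs, left, right), prob in binary.items():
                    chart[(i, j, lhs)] += (prob * chart[(i, k, left)]
                                                * chart[(k, j, right)])
    return chart[(0, n, start)]

# Hypothetical toy grammar: S -> NP VP with probability 1, plus lexical rules.
binary = {("S", "NP", "VP"): 1.0}
lexical = {"dogs": [("NP", 1.0)], "bark": [("VP", 1.0)]}
print(inside_probability(["dogs", "bark"], lexical, binary))  # 1.0
```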
Chapter
Artificial intelligence (AI) focuses on getting machines to do things that we would call intelligent behavior. Intelligence, whether artificial or otherwise, does not have a precise definition, but there are many activities and behaviors that are considered intelligent when exhibited by humans and animals. Examples include seeing, learning, using tools, understanding human speech, reasoning, making good guesses, playing games, and formulating plans and objectives. AI focuses on how to get machines or computers to perform these same kinds of activities, though not necessarily in the same way that humans or animals might do them.