An Emperical Framework of Idioms Translator From Bengali to English: Rule Based Approach

January 2020

DOI:10.1109/TENSYMP50017.2020.9230738

Conference: 2020 IEEE Region 10 Symposium (TENSYMP)

2020 IEEE Region 10 Symposium (TENSYMP), 5-7 June 2020, Dhaka, Bangladesh

An Emperical Framework of Idioms Translator From Bengali to English: Rule Based Approach

Ayesha Khatun∗, Md Gulzar Hussain†, Md Jahidul Islam‡, Sumaiya Kabir§, Md Mahin¶

Department of Computer Science & Engineering, Green University of Bangladesh, Dhaka, Bangladesh.

ayeshankhatun@gmail.com∗, gulzar.ace@gmail.com†, jahidul.jnucse@gmail.com‡, summa.cse@gmail.com§, mahin@cse.green.edu.bd¶

Abstract—Idioms play a vital part in effective communication and are a crucial part of cultural inheritance. An idiom is a group of words whose combined meaning differs from the meanings of its individual words; because of this metaphorical behavior, idioms cause difficulties for general machine translation systems. In this paper, we propose a framework for translating Bengali to English. Context-sensitive grammar rules are created for parsing. The top-down algorithm is used for parsing the sentences. We also propose an algorithm for translating the idioms in a sentence. The proposed system is implemented and tested with about 15000 sentences. The performance analysis of the system gives 85.33% accuracy, which is quite satisfactory.

Keywords—Bangla Machine Translator; Idioms; Bangla Language Processing (BLP); Left corner parsing algorithm.

I. INTRODUCTION

An idiom is a commonly used phrase or expression that means something other than its literal sense. Idioms convey a specific feeling and a specific tone in a language. Because they are so commonly used, idioms are readily recognized by speakers. Machine Translation (MT) is the use of computers to translate a source language into a target language, generally without any human intervention.

The MT model follows three main phases: parsing, transfer, and generation. Our idiom translator, however, follows four stages: idiom translation, parsing, transfer, and generation. The idiom translator detects the idiom in the sentence and translates it. The parser gathers the syntactic information of the sentence using Context Free Grammars (CFG). In the transfer stage, rules are transferred from the source language to the target language. Finally, the target sentence is produced in the generation stage. Because idioms do not signify the literal meaning of the words used, they are hard to translate from the source language to the target language.

Native Bangla speakers grow up speaking and hearing idioms, and the same holds for native English speakers. Idioms play a vital role in the culture of the speakers of every language. In this modern age, it is very important to share knowledge and culture between different regions. Because of the language barrier, however, Bangladeshis are not getting the advantage of learning about other cultures. Bangla Language Processing can play an important role in overcoming this barrier. Our idiom translator will help Bengali people understand idioms in the English language, which will help them engage with that culture and break the cultural barrier.

The rest of the paper is organized as follows. Section II discusses related work. Section III describes the methodology and illustrates it with a sample sentence. Section IV presents the results and discussion, and Section V concludes the paper.

II. RELATED WORK

Research on natural language processing started in the 1950s. The first statistical machine translation systems were developed in the late 1980s [1]. Since then, much work has been done for the English language. The authors of [2] developed a Japanese-English machine translation system, supported by the Japanese government's science and technology agency. The system applies many structural transformations during the transfer and generation phases to reduce the structural differences between expressions of the same content and to avoid ellipsis problems.

Machine translation from Bangla to other languages is still at an early stage, although several works on Bangla-to-English translation and vice versa have appeared recently. A phrase-based Statistical Machine Translation (SMT) approach is proposed in [3], which also handles Out-of-Vocabulary (OOV) words. The authors of [4] proposed a rule-based transfer approach together with an algorithm for searching words in the lexicon; the search is made efficient by an intelligent integer-based lexicon system. NLP techniques are used to translate English sentences into Bangla in [5]: a context-free grammar validates the syntactic structure of a sentence and a bottom-up approach is used for parsing, with 50 sentences used for every tense. A verb-based machine translation approach for English to Bangla is proposed in [6]; it identifies the main verb, reduces the English sentence to a simple form, and then translates that form into Bangla. The authors of [7] also proposed a context-sensitive grammar to translate Bangla to English. In [8], a new technique with a set of context-sensitive grammar rules is proposed to parse Bangla imperative, optative and exclamatory sentences, where the mood of a sentence is given more importance than its structure. The authors of [9] work on finding the appropriate verb according to tense and subject. They propose a procedure for finding a semantically valid verb, working with verb roots, along with several algorithms.

978-1-7281-7366-5/20/$31.00 ©2020 IEEE

Most MT systems translate Bangla sentences into corresponding English sentences, but we found only one work that includes idioms [10]. That paper presents a multilingual parallel idiom dataset for seven Indian languages in addition to English, and shows its relevance for two NLP applications. For our MT system, we propose a set of CSG rules to translate Bangla sentences with idioms into their corresponding English sentences. Most existing work does not describe an architecture or procedure for translating idioms, and works with little data. In this paper we propose an architecture for translating such sentences.

III. PROPOSED METHODOLOGY

The proposed system has ten modules: idioms checker, idioms translator, tokenizer, rule generator, database, parser, target language rules, source language rules, machine translator, and generator. As a running example, we take the Bengali sentence "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা" as input to the system. The step-by-step procedure is shown in Fig. 1.

Fig. 1. Workflow of proposed system

A. Tokenizer

The main task of the tokenizer module is to split sentences into unit strings. It works like a database system of words with their corresponding Part of Speech (POS) tags. For the input sentence "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা", the output will be "এই", "সমাজে", "বৃদ্ধ", "লোকেরা", "অচল", "পয়সা". After the sentence is tokenized, the tokens are passed to the idioms checker.
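A minimal sketch of the tokenizing step is given here, assuming that tokens are separated by whitespace and that the example sentence is spelled as restored above. The function name `tokenize` is illustrative and does not come from the implemented system.

```python
def tokenize(sentence: str) -> list[str]:
    """Split a sentence into unit strings on whitespace."""
    return sentence.split()


# Example sentence from the paper (spelling assumed after repairing the extraction).
sentence = "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা"
tokens = tokenize(sentence)
print(tokens)
```

A production tokenizer would probably also strip punctuation such as the Bangla full stop, which this sketch does not attempt.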

B. Idioms Checker

The main task of the idioms checker is to detect idioms in the sentence using the idioms checker algorithm (Algorithm 1). Suppose the current token is w_i = অচল, which also occurs in an idiom d_i of the idioms dataset. The checker then takes the next word, w_{i+1} = পয়সা, and concatenates the two strings: k = stringConcat(অচল, পয়সা). Because k is equal to an idiom d_i in the dataset, the sentence moves on to the next step, the idioms translator. If no match is found, the checker keeps concatenating the following words, up to i = 5, and then passes the sentence to the parser. In short, a sentence that contains an idiom goes to the idioms translator. For example, in "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা" the phrase "অচল পয়সা" is an idiom, so the sentence goes to the idioms translator module.

Algorithm 1: Algorithm for Idioms Checker

1. If w_i is equal to a word in the split of d_i;
2. Find w_{i+1};
3. Function mPairWord(w_1, w_2, ..., w_n);
4. k = function stringConcat(w_1, w_2, ..., w_n);
5. for i = 0 to idioms dataset length do
       if k == d_i then
           go to 6;
       end
   end
   go to 7;
6. go to Idioms Translator module;
7. go to Parser module;
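The checking procedure can be sketched as follows. This is only an illustrative reading of Algorithm 1, not the implemented system: the in-memory dictionary `IDIOMS`, the names `find_idiom` and `MAX_WORDS`, and the interpretation of "up to i = 5" as a limit of five words per candidate are all assumptions. The strings are normalized to Unicode NFC so that visually identical Bangla words compare equal.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so that equivalent Bangla code point sequences match."""
    return unicodedata.normalize("NFC", text)


# Hypothetical stand-in for the idioms dataset (idiom -> meaning).
IDIOMS = {nfc(k): v for k, v in {"অচল পয়সা": "মূল্যহীন", "উত্তম-মধ্যম": "প্রহার"}.items()}
MAX_WORDS = 5  # assumed reading of the limit "i = 5" in the text


def find_idiom(tokens, idioms=IDIOMS, max_words=MAX_WORDS):
    """Return (start, end, idiom) for the first idiom found, or None."""
    tokens = [nfc(t) for t in tokens]
    for start in range(len(tokens)):
        last = min(len(tokens), start + max_words)
        # Grow the candidate one word at a time: w_i, then w_i w_{i+1}, ...
        for end in range(start + 1, last + 1):
            k = " ".join(tokens[start:end])
            if k in idioms:
                return start, end, k
    return None


tokens = "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা".split()
found = find_idiom(tokens)
print(found)
```

A result of `None` corresponds to step 7 (go to the parser), and any other result to step 6 (go to the idioms translator).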

C. Idioms Translator

This module translates an idiom into its actual meaning. Once the idioms checker finds that the sample input sentence contains the idiom "অচল পয়সা", this module translates the idiom into its corresponding meaning, "মূল্যহীন". After the idiom is translated, the sentence goes to the parser module, as shown in Fig. 2.

Fig. 2. Module of Idioms Translator
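The replacement performed by this module might look like the sketch below. The function `translate_idiom` and the one-entry dictionary are hypothetical; the lookup is repeated inside the function only so that the example is self-contained.

```python
def translate_idiom(tokens, idioms):
    """Replace the first idiom found in tokens by the words of its meaning."""
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            phrase = " ".join(tokens[start:end])
            if phrase in idioms:
                return tokens[:start] + idioms[phrase].split() + tokens[end:]
    return list(tokens)  # no idiom: the sentence is passed on unchanged


# Illustrative idiom table with the running example from the paper.
idioms = {"অচল পয়সা": "মূল্যহীন"}
tokens = "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা".split()
rewritten = translate_idiom(tokens, idioms)
print(" ".join(rewritten))
```

The rewritten token list is what the parser receives, so the parser never sees the idiom itself.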

D. Database

The database module works like a dictionary: it contains the lexicon, that is, the tokens of a sentence and their related POS tags. For example, the POS tags of the words in this sentence are "এই" → PN, "সমাজে" → N, "বৃদ্ধ" → Adj, "লোকেরা" → N, "মূল্যহীন" → Adj. The database also has another table, which holds a set of Bangla idioms and their meanings. Table I shows the idioms table.
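As a rough illustration of the lexicon lookup, the sketch below stores the POS tags listed above in a plain dictionary. The paper does not describe the storage layer, so the dictionary `LEXICON` and the function `pos_tags` are assumptions made only for this example.

```python
# Hypothetical lexicon table: token -> POS tag, using the tags given in the text.
LEXICON = {
    "এই": "PN",
    "সমাজে": "N",
    "বৃদ্ধ": "Adj",
    "লোকেরা": "N",
    "মূল্যহীন": "Adj",
}


def pos_tags(tokens, lexicon=LEXICON):
    """Return the POS tag of every token; an unknown word raises KeyError."""
    return [lexicon[token] for token in tokens]


tags = pos_tags(list(LEXICON))  # the keys above are listed in sentence order
print(tags)
```

The resulting tag sequence is the input on which the grammar rules of the following sections operate.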

TABLE I
BANGLA IDIOMS TABLE

Idioms (d_i) | Meaning (m_i)
অচল পয়সা | মূল্যহীন
অকালকুষ্মাণ্ড | অপদার্থ
ইঁচড়ে পাকা | অকালপক্ব
উত্তম-মধ্যম | প্রহার
এলাহি কাণ্ড | বিরাট ব্যাপার

E. Rule Generator

The main purpose of the rule generator module is to generate the grammatical rules of Bangla sentences. To translate the sentences, this module generates Context-Sensitive Grammar (CSG) rules. The parse tree of the input sentence is built with the help of these rules; for example, this sentence needs the rules NP → N (Biv) (Adj), S → NP VP, and NP → (Qnt) (PP) N | PN to generate its parse tree. Sample CSG rules for simple Bangla sentences are listed in Table II.

TABLE II
BANGLA CSG RULES

Rule No | Bangla CSG Rules
1 | S → NP VP
2 | NP → N (Biv) (Adj)
3 | NP → N (Aux) (PP)
4 | NP → NP NP
5 | NP → (PN) N (Biv) (Adj)
6 | NP → (Adj) N (Biv)
7 | NP → (Qnt) (PP) N | PN
8 | NP → N
9 | PP → Null
10 | V → Null
11 | VP → V
12 | VP → (Adj)
13 | VP → (NP) VP
14 | VP → V (Aux)
15 | Adj → বৃদ্ধ, ভাল, অমূল্য, খারাপ, ...
16 | PN → এই, আমি, আপনি, তুমি, ...
17 | N → চোর, প্রহার, লোক, সমাজ, ...
18 | V → হয়, ছাড়ল, পড়া, খাওয়া, ...
19 | Biv → টাকে, এরা, এ, ...
20 | Aux → দিয়ে, পরে, করে, ...
21 | Qnt → একটি, পাঁচটি, ...
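To make the rule notation concrete, the sketch below expands a right-hand side such as N (Biv) (Adj) into every concrete symbol sequence it allows. It assumes that a symbol in parentheses is optional, which is the usual reading of this notation but is not stated explicitly in the paper; the function name `expand_optionals` is hypothetical.

```python
from itertools import product


def expand_optionals(rhs: str) -> list[tuple[str, ...]]:
    """Expand 'N (Biv) (Adj)' into all sequences with optional symbols kept or dropped."""
    choices = []
    for symbol in rhs.split():
        if symbol.startswith("(") and symbol.endswith(")"):
            choices.append([(symbol[1:-1],), ()])  # optional: present or absent
        else:
            choices.append([(symbol,)])  # mandatory symbol
    # Concatenate one choice per position into a flat tuple of symbols.
    return [sum(combo, ()) for combo in product(*choices)]


for alternative in expand_optionals("N (Biv) (Adj)"):
    print("NP ->", " ".join(alternative))
```

Rule 2 of Table II therefore stands for four plain rules, which is the form that a simple parser would work with.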

F. Parser

A parse tree is a graphical view of the grammatical structure of a sentence. The parser module generates the parse tree of a sentence using the CSG rules and the lexicon. We used the left corner parsing algorithm to parse the sentence. The parse tree that this module generates for the input sentence "এই সমাজে বৃদ্ধ লোকেরা মূল্যহীন" is shown in Fig. 3.

Fig. 3. Representation of Bangla parse tree
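The sketch below shows how a parse tree could be derived from the POS tag sequence of the example sentence. It is not the left corner parser named above: for brevity it uses a memoized span-splitting search, and `RULES` is a hand-picked, already expanded subset of Table II chosen as an assumption for this example. The tree it prints is therefore one possible analysis and need not match Fig. 3.

```python
from functools import lru_cache

# Assumed subset of the Table II rules, with optional symbols already expanded.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("NP", "NP"), ("PN", "N"), ("Adj", "N"), ("N",)],
    "VP": [("Adj",), ("V",)],
}


def parse(tags, start="S"):
    """Return one parse tree (nested tuples) for the tag sequence, or None."""
    tags = tuple(tags)

    @lru_cache(maxsize=None)
    def build(symbol, i, j):
        # A POS tag covers exactly one position and must match it.
        if symbol not in RULES:
            return symbol if j - i == 1 and tags[i] == symbol else None
        for rhs in RULES[symbol]:
            children = split_span(rhs, i, j)
            if children is not None:
                return (symbol, *children)
        return None

    @lru_cache(maxsize=None)
    def split_span(rhs, i, j):
        # Distribute tags[i:j] over the symbols of rhs, at least one tag each.
        if len(rhs) == 1:
            tree = build(rhs[0], i, j)
            return None if tree is None else (tree,)
        for k in range(i + 1, j - len(rhs) + 2):
            first = build(rhs[0], i, k)
            if first is None:
                continue
            rest = split_span(rhs[1:], k, j)
            if rest is not None:
                return (first, *rest)
        return None

    return build(start, 0, len(tags))


tree = parse(["PN", "N", "Adj", "N", "Adj"])
print(tree)
```

Because every symbol must cover at least one tag, the recursive rule NP → NP NP always works on strictly shorter spans, so the search terminates.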

G. Transfer

The task of the transfer module is to translate the Bangla sentence into English. The English grammar rules used for the transformation are listed in Table III. Using these grammar rules and the transformation algorithm, we obtain the parse tree of the English sentence, which is shown in Fig. 4.

The transformation process is divided into two parts, rule transfer and lexicon transfer. The process of transforming grammar from source to target language, or from target to source, is shown in Table IV.

TABLE III
ENGLISH CSG RULES

Rule No | English CSG Rules
1 | S → NP VP
2 | S → VP NP
3 | NP → NP NP
4 | NP → Det N
5 | NP → (PP) N (Adv)
6 | NP → (PP) (PN) (Det) N
7 | NP → Adj N
8 | NP → Qnt N
9 | NP → N
10 | NP → PN
11 | NP → (Aux) N
12 | VP → V
13 | VP → V (Adj)
14 | VP → VP NP
15 | VP → V (Gr) (N) (Adj)
16 | VP → Aux V
17 | N → thief, beating, society, person, ...
18 | PN → this, that, I, She, ...
19 | V → release, are, like, eat, go, ...
20 | Adj → old, priceless, bad, good, ...
21 | Aux → do, are, is, ...
22 | PP → in, on, to, ...
23 | Det → the, a, an, ...
24 | Gr → ing

Fig. 4. Representation of English parse tree

TABLE IV

TRANSFORMATION OF TARGET TO SOURCE OR VICE VERSA
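The two parts of the transfer step, rule transfer and lexicon transfer, can be sketched on a toy tree as follows. Everything here is an assumption for illustration: the contents of Table IV are not available in the text, so the single reordering rule pairs Bangla rule 13 (VP → (NP) VP) with English rule 14 (VP → VP NP), and the three-word lexicon and the sentence are not taken from the paper.

```python
# Hypothetical lexicon transfer table (Bangla word -> English word).
LEXICON = {"আমি": "I", "ভাত": "rice", "খাই": "eat"}

# Hypothetical rule transfer table: (parent, child labels) -> new child order.
REORDER = {("VP", ("NP", "VP")): (1, 0)}


def is_leaf(node):
    """A leaf is a (POS tag, word) pair."""
    return len(node) == 2 and isinstance(node[1], str)


def transfer(node):
    """Translate the leaves and reorder the children of a Bangla parse tree."""
    if is_leaf(node):
        tag, word = node
        return (tag, LEXICON.get(word, word))  # lexicon transfer
    symbol = node[0]
    children = [transfer(child) for child in node[1:]]
    order = REORDER.get((symbol, tuple(child[0] for child in children)))
    if order is not None:  # rule transfer
        children = [children[i] for i in order]
    return (symbol, *children)


def generate(node):
    """Read the target sentence off the leaves, left to right."""
    if is_leaf(node):
        return [node[1]]
    return [word for child in node[1:] for word in generate(child)]


bangla_tree = ("S", ("NP", ("PN", "আমি")), ("VP", ("NP", ("N", "ভাত")), ("VP", ("V", "খাই"))))
english = " ".join(generate(transfer(bangla_tree)))
print(english)
```

Reading the leaves of the transferred tree from left to right corresponds to the generation stage of the workflow.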

IV. EXPERIMENTAL RESULT

To assess the performance of our proposed system, we evaluated it on about 15000 sentences of distinct types and lengths. We collected these sentences from various sources, including books, websites, Bangla grammar books, and Bangla textbooks.

A. Implementation

The system was implemented on Windows 10, with Java as the programming language, Java Swing for the user interface, and NetBeans 8.2 as the IDE. Fig. 5 shows a snapshot of our implemented MT system translating the sentence "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা", which contains an idiom. Google Translate does not produce an appropriate translation of the same sentence, as shown in Fig. 6.

Fig. 5. Translation of the sentence "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা"

Fig. 6. Translation of the sentence "এই সমাজে বৃদ্ধ লোকেরা অচল পয়সা" in Google translator

B. Accuracy Rate

We observed that, of the 15000 sentences, a total of 12800 were correctly translated by our proposed model. The accuracy rate is the ratio of the number of correctly translated sentences to the total number of sentences. Table V shows the accuracy rate for sentences of different lengths, and Fig. 7 plots the system's accuracy rate against sentence length. The graph shows that the accuracy rate decreases as the length of the sentences increases.

TABLE V
ACCURACY RATE OF DIFFERENT SENTENCES WITH DIFFERENT LENGTH

Sentence length | No. of input sentences | Correctly translated sentences | Overall accuracy (%)
3 | 3500 | 3300 | 94.29
4 | 3250 | 2850 | 87.69
5 | 3100 | 2650 | 85.48
6 | 2750 | 2150 | 78.18
7 | 2400 | 1850 | 77.08
Total | 15000 | 12800 | 85.33
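The accuracy rate defined above can be recomputed from the sentence counts with a few lines of code. The sketch below only copies the input and correct counts from Table V; the names `RESULTS` and `accuracy` are illustrative.

```python
# (sentence length, input sentences, correctly translated sentences) from Table V
RESULTS = [
    (3, 3500, 3300),
    (4, 3250, 2850),
    (5, 3100, 2650),
    (6, 2750, 2150),
    (7, 2400, 1850),
]


def accuracy(correct: int, total: int) -> float:
    """Accuracy rate in percent: correctly translated / total sentences."""
    return 100.0 * correct / total


total_inputs = sum(n for _, n, _ in RESULTS)
total_correct = sum(c for _, _, c in RESULTS)
overall = accuracy(total_correct, total_inputs)

for length, n, c in RESULTS:
    print(f"length {length}: {accuracy(c, n):.2f}%")
print(f"overall: {overall:.2f}%")
```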

Fig. 7. Accuracy vs. word length graph

C. Comparison Analysis

Table VI compares our proposed method with paper [10] on parameters such as application, emphasis, feature, and accuracy. As Table VI shows, paper [10] uses XML markup as its feature, whereas our system uses a rule-based approach, which we consider more appropriate than XML markup.

TABLE VI
COMPARISON BETWEEN PAPER [10] AND OUR PROPOSED SYSTEM

Parameter | Paper [10] [R. Agrawal, 2018] | Our proposed system
Application | MT, Sentiment Analysis | MT
Emphasis | Indian languages | Only Bangla language
Feature | XML markup | Rule based
Accuracy | 2.69% BLEU score | 85.33% for 0.015 million corpora
Dataset (Idioms) | 2208 for 7 languages | 986 for Bangla language

V. CONCLUSION

The aim of this paper is to translate Bangla sentences containing idioms into their corresponding English sentences. The idea was to design a proper parsing technique for Bangla sentences with idioms. Our proposed algorithm is able to detect idioms and translate them into their corresponding English meaning. The experimental results show that our technique achieves an accuracy of 85.33%. Our system might not produce the exact parse tree for some sentences, and we evaluated our parsing model only on very simple and short Bangla sentences. A stronger parser for Bangla sentences could be designed by updating the CSG rules, for example by using semantic features; we leave this for further research.

ACKNOWLEDGEMENT

This work has been financially supported by Green University of Bangladesh Research Fund.

REFERENCES

[1] Wikipedia. (2019) Natural language processing. [Online]. Available: https://en.wikipedia.org/wiki/Natural_language_processing

[2] M. Nagao, J. Tsujii, and J. Nakamura, “Machine translation from Japanese into English,” Proceedings of the IEEE, vol. 74, no. 7, pp. 993–1012, July 1986.

[3] M. Z. Islam, J. Tiedemann, and A. Eisele, “English to Bangla phrase-based machine translation,” in Proceedings of the 14th Annual Conference of the European Association for Machine Translation, 2010.

[4] M. G. R. Alam, M. M. Islam, and N. Islam, “A new approach to develop an English to Bangla machine translation system,” Daffodil International University Journal of Science and Technology, vol. 6, no. 1, pp. 36–42, 2011.

[5] K. Muntarina, M. G. Moazzam, and M. A.-A. Bhuiyan, “Tense based English to Bangla translation using MT system,” International Journal of Engineering Science Invention, vol. 2, no. 10, pp. 30–38, 2013.

[6] M. Rabbani, K. M. R. Alam, and M. Islam, “A new verb based approach for English to Bangla machine translation,” in 2014 International Conference on Informatics, Electronics & Vision (ICIEV). IEEE, 2014, pp. 1–6.

[7] M. S. Arefin, L. Alam, S. Sharmin, and M. M. Hoque, “An empirical framework for parsing Bangla assertive, interrogative and imperative sentences,” in 2015 International Conference on Computer and Information Engineering (ICCIE). IEEE, 2015, pp. 122–125.

[8] T. Alamgir and M. S. Arefin, “An empirical framework for parsing Bangla imperative, optative and exclamatory sentences,” in 2017 International Conference on Electrical, Computer and Communication Engineering (ECCE). IEEE, 2017, pp. 164–169.

[9] M. Haque and M. Hasan, “English to Bengali machine translation: An analysis of semantically appropriate verbs,” in 2018 International Conference on Innovations in Science, Engineering and Technology (ICISET). IEEE, 2018, pp. 217–221.

[10] R. Agrawal, V. C. Kumar, V. Muralidharan, and D. M. Sharma, “No more beating about the bush: A step towards idiom handling for Indian language NLP,” in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018.
