Corresponding examples Tibetan characters and Unicode coding.

Source publication

Convolutional neural network configuration.

Annotating the character block to be recognized.

Construction of a Character Dataset for Historical Uchen Tibetan Documents under Low-Resource Conditions

Article

Full-text available

Nov 2022

The construction of a character dataset is an important part of the research on document analysis and recognition of historical Tibetan documents. The results of character segmentation research in the previous stage are presented by coloring the characters with different color values. On this basis, the characters are annotated, and the character i...

Context 1

... Tibetan has only top-down superposition, up to 7 layers at most. In the Unicode encoding scheme of Tibetan [18], the encoding length of the same character is inconsistent (Table 1). Therefore, the annotated character text needs to be processed with a unified coding method, that is, the way that all annotated text is unified into the least number of coding units. ...

View in full-text

Autoencoder Image Denoising to Increase Optical Character Recognition Performance in Text Conversion

Conference Paper

Full-text available

Nov 2022

Corresponding examples Tibetan characters and Unicode coding.

Context in source publication

Similar publications