/TSRJ-ITC

The recognition phase of an Optical Character Recognition (OCR) system produces a ranked list of candidate characters, of which the top one is usually taken as the recognition result without taking context into account. A recognition error occurs when the correct character is not at the top of the list, mostly because of shape similarity between characters. In this paper, we propose using character trigrams for Khmer OCR: the two previous characters are taken into account when choosing the recognition result from the candidate list. A text corpus of about 300 Mbytes is used to compute the character trigrams. Using these trigrams, we test our approach on about 3000 characters. The results show that this approach can correct about 30% of recognition errors.
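As an illustration of the idea, the following minimal Python sketch counts character trigrams in a corpus and then re-selects each character from its candidate list using the two previously chosen characters as context. The function names `count_trigrams` and `rerank`, the greedy left-to-right selection, and the use of raw counts as scores are assumptions made for this example; the abstract does not specify how the trigram statistics are combined with the recognizer's ranking.

```python
from collections import Counter


def count_trigrams(text):
    """Count every run of three consecutive characters in the corpus text."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def rerank(candidate_lists, trigrams):
    """Pick one character per position from ranked candidate lists.

    candidate_lists: one list per character position, best OCR guess first.
    trigrams: Counter mapping 3-character strings to corpus counts.

    Assumption (not from the paper): selection is greedy, left to right,
    and a candidate's score is simply its raw trigram count.
    """
    chosen = []
    for candidates in candidate_lists:
        if len(chosen) < 2:
            # Not enough context yet: keep the recognizer's top candidate.
            chosen.append(candidates[0])
            continue
        context = chosen[-2] + chosen[-1]
        # max() returns the first maximal item, so ties (including the case
        # where no candidate forms a known trigram) keep the OCR ranking.
        best = max(candidates, key=lambda c: trigrams[context + c])
        chosen.append(best)
    return "".join(chosen)


# Toy example with Latin letters standing in for Khmer characters.
trigrams = count_trigrams("the cat sat on the mat")
# The recognizer ranks "c" above "e" for the third character.
print(rerank([["t"], ["h"], ["c", "e"]], trigrams))  # prints "the"
```

In this toy run the trigram "the" occurs in the corpus while "thc" does not, so the second-ranked candidate replaces the top one. A practical system would probably also weigh the recognizer's confidence and smooth the counts for unseen trigrams; both are left out here for brevity.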

Improving Recognition Result Using Character Trigram for Khmer OCR
