Improving Recognition Result Using Character Trigram for Khmer OCR
1. Department of Computer Science,
Institute of Technology of Cambodia, Russian Federation Blvd., P.O. Box 86, Phnom Penh, Cambodia.
Received: January 20, 2024 / Revised: / Accepted: January 20, 2024 / Published: June 01, 2013
The recognition phase of an Optical Character Recognition (OCR) system produces a ranked list of candidate characters, of which the top one is usually taken as the recognition result without taking context into account. A recognition error occurs when the correct character is not at the top of the list, which is mostly due to shape similarity between characters. In this paper, we propose using character trigrams for Khmer OCR: the two previous characters are taken into account when choosing the recognition result from the candidate list. A text corpus of about 300 Mbytes is used to compute the character trigrams. Using these trigrams, we test our approach on about 3000 characters. The results show that this approach can correct about 30% of recognition errors.
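The idea of choosing from the candidate list with a character trigram can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how trigram statistics are smoothed or combined with recognizer scores, so the add-one smoothing, the greedy left-to-right decoding, and the function names `train_trigrams`, `trigram_prob`, and `rerank` are all assumptions made for illustration. A toy Latin-script corpus stands in for the Khmer text corpus.

```python
from collections import Counter


def train_trigrams(corpus):
    """Count character trigrams and the two-character contexts that precede them."""
    n = len(corpus) - 2
    trigrams = Counter(corpus[i:i + 3] for i in range(n))
    contexts = Counter(corpus[i:i + 2] for i in range(n))
    return trigrams, contexts


def trigram_prob(trigrams, contexts, context, char, vocab_size, alpha=1.0):
    """Estimate P(char | context) with add-alpha smoothing (an assumed choice)."""
    return (trigrams[context + char] + alpha) / (contexts[context] + alpha * vocab_size)


def rerank(candidate_lists, trigrams, contexts, vocab_size):
    """Pick one character per position, greedily, from ranked candidate lists.

    candidate_lists[i] holds the recognizer's candidates for position i,
    best first. The first two positions have no full context, so the top
    candidate is kept. Ties keep the recognizer's order, because max()
    returns the first maximal element.
    """
    chosen = []
    for candidates in candidate_lists:
        context = "".join(chosen[-2:])
        if len(context) < 2:
            chosen.append(candidates[0])
        else:
            chosen.append(max(
                candidates,
                key=lambda c: trigram_prob(trigrams, contexts, context, c, vocab_size),
            ))
    return "".join(chosen)


# Toy example: the recognizer ranks "c" above "e" at the third position
# because the shapes are similar; the trigram statistics prefer "the".
corpus = "the cat sat on the mat. the hat is on the cat."
trigrams, contexts = train_trigrams(corpus)
vocab_size = len(set(corpus))
print(rerank([["t"], ["h"], ["c", "e"]], trigrams, contexts, vocab_size))
```

A real system would likely weigh the recognizer's confidence against the trigram probability rather than rely on the trigram alone, and for Khmer it would have to decide whether a "character" is a code point or a visual cluster; the abstract specifies neither.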