Word Spotting on Khmer Palm Leaf Manuscript Documents
1. Department of Information and Communication Engineering, Institute of Technology of Cambodia, Russian Federation Blvd., P.O. Box 86, Phnom Penh, Cambodia
Received: July 17, 2023 / Accepted: August 07, 2023 / Available online: June 30, 2024
Word spotting plays a crucial role in document analysis, particularly for ancient palm leaf manuscripts. Khmer palm leaf manuscripts, which are written on dried palm leaf sheets cut into rectangles, hold significant cultural value in Cambodia. They contain valuable historical, religious, and linguistic information, which makes their preservation essential. Extracting information from them is challenging, however, because of their fragility and age and the complexity of Khmer writing and word formation. This study focuses on word spotting and investigates the construction of a Region Proposal Network (RPN) using the You Only Look Once (YOLO) technique and a Convolutional Neural Network (CNN) for the accurate and efficient identification of specific words or phrases within the documents. The proposed method is evaluated on the SleukRith dataset, which consists of 1,971 images of Khmer palm leaf manuscripts. Of these, 1,379 images are allocated to the training set, 395 to the test set, and the remaining 197 to the validation set. Parameter tuning is conducted on two variables: the number of layers and the number of filters. The results show that the best-performing model comprises 3 layers and 24 filters, with a threshold of 0.4. The detection accuracy is approximately 80.86%, while the classification performance reaches 69.29% for the 33 classes of Khmer characters.
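The split sizes and the threshold reported above can be illustrated with a short Python sketch. It is not the authors' code. The split counts come from the abstract; the helper names (`split_fractions`, `keep_confident`) and the `(label, score)` detection format are hypothetical. The abstract does not say what the 0.4 threshold applies to, so the sketch assumes it is a minimum detection confidence score.

```python
# Split sizes as reported in the abstract (SleukRith dataset).
TOTAL_IMAGES = 1971
SPLITS = {"train": 1379, "test": 395, "validation": 197}


def split_fractions(splits, total):
    """Return each split's share of the dataset as a percentage."""
    if sum(splits.values()) != total:
        raise ValueError("split sizes do not add up to the dataset size")
    return {name: 100.0 * count / total for name, count in splits.items()}


def keep_confident(detections, threshold=0.4):
    """Keep detections whose score is at least `threshold`.

    Assumption: each detection is a hypothetical (label, score) pair and
    the 0.4 threshold is a confidence cut-off; the abstract does not
    specify either detail.
    """
    return [d for d in detections if d[1] >= threshold]


fractions = split_fractions(SPLITS, TOTAL_IMAGES)
for name, pct in fractions.items():
    print(f"{name}: {pct:.1f}%")
```

Under these assumptions the three splits work out to roughly 70%, 20%, and 10% of the 1,971 images, and a call such as `keep_confident([("a", 0.39), ("b", 0.4)])` would keep only the second detection.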