Word Spotting on Khmer Palm Leaf Manuscript Documents
1. Department of Information and Communication Engineering, Institute of Technology of Cambodia, Russian Federation Blvd., P.O. Box 86, Phnom Penh, Cambodia
Received: July 17, 2023 / Revised: / Accepted: August 07, 2023 / Published: June 30, 2024
Word spotting plays a crucial role in document analysis, particularly for ancient palm leaf manuscripts. Khmer palm leaf manuscripts, written on rectangularly cut and dried palm leaf sheets, hold significant cultural value in Cambodia. They contain valuable historical, religious, and linguistic information, which makes their preservation essential. However, extracting information from them is challenging because of their fragility and age and the complexity of Khmer writing and word formation. This study focuses on word spotting and investigates the construction of a Region Proposal Network (RPN) using the You Only Look Once (YOLO) technique and a Convolutional Neural Network (CNN) for accurate and efficient identification of specific words or phrases within the documents. The proposed method is evaluated on the SleukRith dataset, which consists of 1,971 images of Khmer palm leaf manuscripts: 1,379 images are allocated to the training set, 395 to the test set, and 197 to the validation set (approximately a 70/20/10 split). Parameter tuning is conducted on two variables: the number of layers and the number of filters. The results show that the optimal model comprises 3 layers and 24 filters, with a threshold of 0.4. Detection accuracy reaches approximately 80.86%, while classification performance reaches 69.29% for the 33 classes of Khmer characters.
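The dataset partition and the thresholding step described above can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_dataset` and `filter_by_threshold`, the random shuffle with a fixed seed, and the `"score"` field are all hypothetical, and the sketch assumes the 0.4 threshold is applied to a per-detection score, which the abstract does not state.

```python
import random


def split_dataset(items, n_train, n_test, seed=0):
    """Shuffle items and split them into train/test/validation by fixed counts.

    Whatever remains after the train and test portions goes to validation.
    The shuffle and the seed are assumptions made for illustration only.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, test, val


def filter_by_threshold(detections, threshold=0.4):
    """Keep detections whose (hypothetical) score is at least the threshold."""
    return [d for d in detections if d["score"] >= threshold]


# Counts taken from the abstract: 1,971 images, 1,379 train, 395 test.
image_ids = range(1971)
train, test, val = split_dataset(image_ids, n_train=1379, n_test=395)
print(len(train), len(test), len(val))  # 1379 395 197
```

The three counts sum to 1,971, so the validation set is exactly the 197 images left over after the training and test sets are taken.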