Published: August 30,2025Tuning Hyperparameters Learning Rate and Gamma in Gym Environment Inverted Pendulum
1. Department of Telecommunication and Network Engineering, Institute of Technology of Cambodia, Russian Federation Blvd., P.O. Box 86, Phnom Penh, Cambodia
Received: August 28, 2024 / Revised: September 11, 2024 / Accepted: September 19, 2024 / Available online: August 30, 2025
Abstract: This paper examines reinforcement learning, a subfield of artificial intelligence and machine learning, in terms of its grounding in statistics, optimization, and mathematical concepts and its use of trial-and-error learning by agents to solve tasks. Hyperparameter tuning in machine learning lets practitioners directly influence a model's architecture, usability, and effectiveness in finding the right solution for a particular problem. This paper investigates the impact of the learning rate and gamma (the discount factor) on a deep reinforcement learning algorithm in the Gym environment Inverted Pendulum version 4 (InvertedPendulum-v4), in which the agent is trained, running on Google Colab. The results show that the choice of hyperparameters significantly affects the agent's performance in terms of sample efficiency, learning stability, and cumulative reward, determining how quickly and reliably the agent learns from interactions, maintains consistent performance across training sessions, and achieves optimal outcomes over time. Careful tuning of these hyperparameters is therefore crucial to maximizing the agent's effectiveness, as even small adjustments can lead to substantial differences in performance metrics. In the experiments, the best results were obtained with a learning rate of 0.0001 and a gamma of 0.99; these settings yielded the highest cumulative rewards and more stable learning than the other values tested.
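To make the roles of the two tuned hyperparameters concrete, the following is a minimal sketch (not taken from the paper) of how gamma enters the discounted return that a policy-gradient agent maximizes, and how the learning rate scales each parameter update. It uses only NumPy; the function name and the toy update are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma*r_{t+1} + gamma^2*r_{t+2} + ...
    for every timestep of one episode, iterating backwards."""
    G = 0.0
    returns = np.zeros(len(rewards))
    for t in reversed(range(len(rewards))):
        G = rewards[t] + gamma * G
        returns[t] = G
    return returns

# Gamma controls how strongly future rewards count toward each step's return.
rets = discounted_returns([1.0, 1.0, 1.0], gamma=0.99)
# rets = [2.9701, 1.99, 1.0]

# The learning rate scales the size of each gradient-ascent step on the
# policy parameters: theta <- theta + lr * grad. With the paper's best
# setting lr = 0.0001, each update moves theta only a small, stable amount.
theta = np.zeros(4)
grad = np.ones(4)           # placeholder gradient for illustration
theta = theta + 1e-4 * grad
```

A small learning rate such as 0.0001 trades slower progress for the stable learning curves reported in the experiments, while gamma = 0.99 keeps the agent focused on long-horizon balancing rather than immediate reward.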