Tuning Hyperparameters Learning Rate and Gamma in Gym Environment Inverted Pendulum
1. Department of Telecommunication and Network Engineering, Institute of Technology of Cambodia, Russian Federation Blvd., P.O. Box 86, Phnom Penh, Cambodia
Received: August 28, 2024 / Revised: September 11, 2024 / Accepted: September 19, 2024 / Available online: August 30, 2025
Abstract: In this paper, reinforcement learning, a subfield of artificial intelligence and machine learning, is examined in terms of how it draws on interdependent concepts from statistics, optimization, and mathematics, and how agents can solve tasks through trial-and-error learning. Hyperparameter tuning in machine learning lets practitioners directly influence the architecture, usability, and effectiveness of a model in finding the right solution for a particular problem. This paper investigates the impact of the learning rate and the discount factor gamma on a deep reinforcement learning algorithm in the Gym environment inverted pendulum version 4 (InvertedPendulum-v4), in which the agent is trained, with experiments run on Google Colab. The results show that the choice of hyperparameters significantly affects the agent's performance in terms of sample efficiency, learning stability, and cumulative reward, determining how quickly and reliably the agent learns from interactions, how consistently it performs across training sessions, and whether it achieves optimal outcomes over time. Careful tuning of these hyperparameters is therefore crucial to maximizing the agent's effectiveness in various environments, as even small adjustments can lead to substantial differences in performance metrics. In the experiments, the best results were observed with a learning rate of 0.0001 and a gamma value of 0.99; these settings yielded the highest cumulative rewards and more stable learning than the other values tested.
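
As a minimal sketch of the setup the abstract describes, the snippet below trains an agent on InvertedPendulum-v4 with the reported best hyperparameters (learning rate 0.0001, gamma 0.99). The paper does not name its implementation, so this assumes the Gymnasium library and the stable-baselines3 PPO agent as a stand-in for the paper's deep RL algorithm; the timestep budget is illustrative, not taken from the paper.

import gymnasium as gym
from stable_baselines3 import PPO

# MuJoCo inverted-pendulum environment studied in the paper.
env = gym.make("InvertedPendulum-v4")

# The two hyperparameters under study: the paper reports the highest
# cumulative reward with learning_rate=0.0001 and gamma=0.99.
model = PPO("MlpPolicy", env, learning_rate=0.0001, gamma=0.99, verbose=1)

# Train the agent (100_000 steps is an assumed, illustrative budget).
model.learn(total_timesteps=100_000)

# Roll out one episode to inspect the learned policy's cumulative reward.
obs, _ = env.reset()
total_reward = 0.0
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    total_reward += float(reward)
    if terminated or truncated:
        break
print(f"Episode reward: {total_reward}")

Re-running this script with other values (e.g. a learning rate of 0.01 or a gamma of 0.9) reproduces the kind of comparison the paper performs, since both parameters are plain keyword arguments to the agent's constructor.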
