Author: Morton, Cory
Date: 2025-11-20
Issued: 2025
URI: https://hdl.handle.net/2097/47039

Abstract: Establishing a long-term human presence on the lunar surface necessitates robust autonomous systems capable of navigating unpredictable terrain with limited external control. Surface mobility is challenged by diverse slopes, abrasive regolith, and low-gravity conditions, all while operating under strict power and processing constraints. This thesis investigates the application of Deep Reinforcement Learning (DRL) to train a neural network-based motion controller for a legged robotic platform, with the goal of deploying the trained controller to a low-resource microcontroller. Using the MuJoCo-based Ant environment from the Gymnasium API, a DRL agent was trained with the Proximal Policy Optimization (PPO) algorithm to develop stable and adaptive locomotion policies. The trained model was then quantized and converted using the ExecuTorch framework for deployment on an Arduino Nano 33 BLE Sense microcontroller, representative of the limited processing resources expected in real-world space missions. Performance was evaluated by comparing the behavior of the controller on the development machine to its behavior on the microcontroller. The deployed model maintained performance comparable to the trained model; although some degradation due to conversion and hardware limitations was observed, the study offers a practical roadmap for bridging high-performance simulation with real-time embedded control. These findings contribute to advancing autonomous mobility systems for extraterrestrial exploration, emphasizing the potential of DRL and TinyML to operate effectively in low-power, communication-limited environments.

Language: en-US
Subjects: Deep Reinforcement Learning (DRL); TinyML; Motion control; Legged robotics; Embedded systems; Proximal Policy Optimization (PPO)
Title: Investigating legged robots' mobility control on simple surfaces using a light-weight machine learning architecture
Type: Thesis
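
The abstract describes a two-stage workflow: training a PPO locomotion policy in simulation, then lowering the policy for on-device inference. The first sketch below illustrates the training stage, assuming Stable-Baselines3 as the PPO implementation and Gymnasium's MuJoCo Ant environment; the thesis does not name a specific training library, and the network size and timestep budget shown are illustrative assumptions, not the thesis's settings.

```python
# Minimal PPO training sketch, assuming Stable-Baselines3 (the thesis does
# not name a specific PPO implementation) and Gymnasium's MuJoCo Ant task.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("Ant-v4")  # environment version id may differ by Gymnasium release

# A small multilayer-perceptron policy, keeping the network compact with a
# microcontroller deployment target in mind (layer sizes are assumptions).
model = PPO(
    "MlpPolicy",
    env,
    policy_kwargs={"net_arch": [64, 64]},
    verbose=1,
)
model.learn(total_timesteps=1_000_000)  # illustrative training budget
model.save("ant_ppo")
```

For the deployment stage, the second sketch follows the standard torch.export-to-ExecuTorch lowering path to produce a .pte program that the on-device ExecuTorch runtime can load. The stand-in `actor` module, its layer sizes, and the observation and action dimensions are hypothetical placeholders for the trained policy network; the quantization step mentioned in the abstract is omitted here for brevity.

```python
# Minimal ExecuTorch export sketch. `actor` is a hypothetical stand-in for
# the trained PPO policy network extracted as a plain torch.nn.Module.
import torch
from executorch.exir import to_edge

actor = torch.nn.Sequential(
    torch.nn.Linear(27, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 8),
)
# Default Ant observation size is 27 and action size is 8 (assumption;
# both vary with environment configuration).
example_obs = (torch.randn(1, 27),)

exported = torch.export.export(actor.eval(), example_obs)  # capture a static graph
edge = to_edge(exported)           # lower to the ExecuTorch edge dialect
et_program = edge.to_executorch()  # produce an executable ExecuTorch program

# Serialize the .pte file that the embedded ExecuTorch runtime loads.
with open("ant_policy.pte", "wb") as f:
    f.write(et_program.buffer)
```

On the Arduino Nano 33 BLE Sense side, the thesis's approach implies linking the ExecuTorch C++ runtime into the firmware and feeding it the serialized .pte program; the details of that integration are beyond what the abstract specifies.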