Aarthi Anbalagan, Muthuraman Saminathan, and Vincent Kanka. “Reinforcement Learning from Human Feedback for Enhanced Code Generation and Debugging Capabilities in LLMs”. Journal of Computational Intelligence and Robotics 4, no. 1 (April 10, 2024): 152–193. Accessed June 14, 2025. https://nucleuscorp.org/jcir/article/view/563.