Reinforcement learning control of robot manipulator

Authors: COTRIM, LUCAS P.; JOSE, MARCOS M.; CABRAL, EDUARDO L.L.
Publication year: 2021
Record date: 2022-03-15
Citation: COTRIM, LUCAS P.; JOSE, MARCOS M.; CABRAL, EDUARDO L.L. Reinforcement learning control of robot manipulator. Revista Brasileira de Computação Aplicada, v. 13, n. 3, p. 42-53, 2021. DOI: 10.5335/rbca.v13i3.12091 (https://dx.doi.org/10.5335/rbca.v13i3.12091). Available at: http://repositorio.ipen.br/handle/123456789/32793.
ISSN: 2176-6649
Type: Journal article
Access: openAccess
Keywords: control equipment; robots; manipulators; learning; artificial intelligence; neural networks
ORCID: 0000-0001-6632-2692

Abstract: Since the establishment of robotics in industrial applications, industrial robot programming has involved the repetitive and time-consuming process of manually specifying a fixed trajectory, resulting in machine idle time in production and the need to completely reprogram the robot for different tasks. The increasing number of robotics applications in unstructured environments requires not only intelligent but also reactive controllers, due to the unpredictability of the environment and safety measures, respectively. This paper presents a comparative analysis of two classes of Reinforcement Learning algorithms, value iteration (Q-Learning/DQN) and policy iteration (REINFORCE), applied to the discretized task of positioning a robotic manipulator in an obstacle-filled simulated environment, with no previous knowledge of the obstacles’ positions or of the robot arm dynamics. The agent’s performance and algorithm convergence are analyzed under different reward functions and on four increasingly complex test projects: a 1-degree-of-freedom (DOF) robot, a 2-DOF robot, a Kuka KR16 industrial robot, and a Kuka KR16 industrial robot with random setpoint/obstacle placement. The DQN algorithm presented significantly better performance and reduced training time across all test projects, and the third reward function generated better agents for both algorithms.
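To make the value-based side of the comparison more concrete, the following is a minimal sketch of tabular Q-Learning on a toy version of the simplest test project, a 1-DOF joint whose angle is discretized into bins. It is not the authors' implementation: the number of bins, the goal bin, the two-action set, the reward values (-1 per step, +10 at the goal), and the hyperparameters are all assumptions chosen for illustration, and no obstacles are modeled.

```python
import random

# All constants below are hypothetical and not taken from the paper.
N_BINS = 12          # discretization of the single joint angle
GOAL = 9             # bin that represents the setpoint
ACTIONS = (-1, +1)   # rotate the joint one bin down or one bin up


def step(state, action_idx):
    """Apply an action and return (next_state, reward, done)."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_BINS - 1)
    if nxt == GOAL:
        return nxt, 10.0, True   # assumed bonus for reaching the setpoint
    return nxt, -1.0, False      # assumed per-step penalty


def greedy_action(q, state):
    """Pick the action with the larger Q-value (ties go to 'up')."""
    return 0 if q[state][0] > q[state][1] else 1


def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_BINS)]
    for _ in range(episodes):
        state = rng.randrange(N_BINS)
        if state == GOAL:
            continue
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = greedy_action(q, state)
            nxt, reward, done = step(state, action)
            # Q-Learning target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_rollout(q, start, max_steps=50):
    """Follow the learned greedy policy and return the visited bins."""
    state, path = start, [start]
    for _ in range(max_steps):
        if state == GOAL:
            break
        state, _, _ = step(state, greedy_action(q, state))
        path.append(state)
    return path
```

Calling `greedy_rollout(train(), 0)` returns the sequence of bins the learned policy visits from the lowest joint position. DQN, as named in the abstract, replaces the table `q` with a neural network, which is what makes the higher-DOF projects tractable; that substitution is not shown here.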