Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper)

Preprint

Natan, Avraham; Stern, Roni; Kalech, Meir · arXiv (Cornell University) · 2017

Author / responsible party

Natan, Avraham; Stern, Roni; Kalech, Meir

Publisher

arXiv (Cornell University)

Year

2017

Language

en

Abstract

Because of safety risks and the sample inefficiency of training, it is often preferred to develop controllers in simulation. However, minor differences between the simulation and the real world can cause a significant sim-to-real gap, which can reduce the effectiveness of the developed controller. In this paper, we examine a case study of transferring an octorotor reinforcement learning controller from simulation to the real world. First, we quantify the effectiveness of the real-world transfer by examining safety metrics. We find that although deviation increases noticeably (by around 100%) in real flights, this deviation may not be considered unsafe, as it remains within safety corridors wider than 2 m. Then, we estimate the densities of the measurement distributions and compare the Jensen-Shannon divergences of simulated and real measurements. From this, we show that the vehicle’s orientation is significantly different between simulated and real flights. We attribute this to a different flight mode in real flights, in which the vehicle turns to face the next waypoint. We also find that the reinforcement learning controller actions appear to correctly counteract disturbance forces. Then, we analyze the errors of a measurement autoencoder and a state transition model neural network applied to real data. We find that these models further reinforce the difference between the simulated and real attitude control, showing the errors directly on the flight paths. Finally, we discuss important lessons learned in the sim-to-real transfer of our controller.
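The density comparison mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which density estimator was used, so a shared-bin histogram is assumed here, the yaw and altitude samples are synthetic, and every function and variable name is hypothetical.

```python
import math
import random


def histogram_density(samples, lo, hi, bins):
    """Normalized histogram of `samples` over [lo, hi] (assumed estimator)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        if lo <= x <= hi:
            # Clamp so that x == hi falls into the last bin.
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi]")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in bits; bins with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def compare(sim, real, bins=30):
    """Estimate both densities on a shared grid and return their JS divergence."""
    lo = min(min(sim), min(real))
    hi = max(max(sim), max(real))
    return js_divergence(histogram_density(sim, lo, hi, bins),
                         histogram_density(real, lo, hi, bins))


# Synthetic stand-ins for logged measurements (illustrative values only).
rng = random.Random(0)
sim_yaw = [rng.gauss(0.0, 0.1) for _ in range(2000)]    # heading held fixed
real_yaw = [rng.gauss(0.8, 0.4) for _ in range(2000)]   # heading turns to waypoints
sim_alt = [rng.gauss(10.0, 0.5) for _ in range(2000)]
real_alt = [rng.gauss(10.0, 0.6) for _ in range(2000)]

js_yaw = compare(sim_yaw, real_yaw)
js_alt = compare(sim_alt, real_alt)
print(f"JS divergence, yaw: {js_yaw:.3f}")
print(f"JS divergence, altitude: {js_alt:.3f}")
```

With base-2 logarithms the divergence is 0 for identical distributions and 1 for distributions with no overlap, so a measurement channel whose score is much higher than the others (yaw in this synthetic setup) is the one to inspect first.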

How to cite

APA 7

Natan, A., Stern, R., & Kalech, M. (2017). Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper). arXiv (Cornell University). https://doi.org/10.4230/oasics.dx.2024.16

MLA

Natan, Avraham, et al. Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper). arXiv (Cornell University), 2017. https://doi.org/10.4230/oasics.dx.2024.16.

Chicago

Natan, Avraham, Roni Stern, and Meir Kalech. 2017. Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper). arXiv (Cornell University). https://doi.org/10.4230/oasics.dx.2024.16.

Harvard

Natan, A., Stern, R. and Kalech, M. 2017, Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper), arXiv (Cornell University), available at: https://doi.org/10.4230/oasics.dx.2024.16 [Accessed 24 Jun. 2026].

Subjects

Computer science; Optimization algorithm; Algorithm; Mathematical optimization; Mathematics