Security of deep reinforcement learning

dc.contributor.author: Behzadan, Vahid
dc.date.accessioned: 2019-07-01T13:59:33Z
dc.date.available: 2019-07-01T13:59:33Z
dc.date.graduationmonth: August
dc.date.issued: 2019-08-01
dc.date.published: 2019
dc.description.abstract: Since the inception of Deep Reinforcement Learning (DRL) algorithms, there has been growing interest from both the research and industrial communities in the promising potential of this paradigm. The list of current and envisioned applications of DRL ranges from autonomous navigation and robotics to control applications in critical infrastructure, air traffic control, defense technologies, and cybersecurity. While the landscape of opportunities and the advantages of DRL algorithms are justifiably vast, the security risks and issues in such algorithms remain largely unexplored. It has been shown that DRL algorithms are brittle, in that they are highly sensitive to small perturbations of their observations of the state. Furthermore, recent reports demonstrate that such perturbations can be applied by an adversary to manipulate the performance and behavior of DRL agents (a minimal sketch of such an observation perturbation follows the metadata fields below). To address such problems, this dissertation aims to advance the current state of the art in three separate but interdependent directions. First, I build on recent developments in adversarial machine learning and robust reinforcement learning to develop techniques and metrics for evaluating the resilience and robustness of DRL agents to adversarial perturbations applied to the observations of state transitions. A main objective of this task is to disentangle the vulnerabilities in the learned representation of state from those that stem from the sensitivity of DRL policies to changes in transition dynamics. A further objective is to investigate evaluation methods that are independent of attack techniques and their specific parameters. Accordingly, I develop two DRL-based algorithms that enable the quantitative measurement and benchmarking of worst-case resilience and robustness in DRL policies. Second, I present an analysis of adversarial training as a solution to the brittleness of Deep Q-Network (DQN) policies, and investigate the impact of hyperparameters on the training-time resilience of policies. I also propose a new exploration mechanism for sample-efficient adversarial training of DRL agents. Third, I address the previously unexplored problem of model extraction attacks on DRL agents. Accordingly, I demonstrate that imitation learning techniques can be used to effectively replicate a DRL policy from observations of its behavior. Moreover, I establish that the replicated policies can be used to launch effective black-box adversarial attacks through the transferability of adversarial examples. Lastly, I address the problem of detecting replicated models by developing a novel technique for embedding sequential watermarks in DRL policies. The dissertation concludes with remarks on the remaining challenges and future directions of research in the emerging domain of DRL security.
dc.description.advisor: Arslan Munir
dc.description.advisor: William H. Hsu
dc.description.degree: Doctor of Philosophy
dc.description.department: Department of Computer Science
dc.description.level: Doctoral
dc.description.sponsorship: National Science Foundation
dc.identifier.uri: http://hdl.handle.net/2097/39799
dc.language.iso: en_US
dc.subject: Reinforcement learning
dc.subject: Machine learning
dc.subject: Adversarial machine learning
dc.subject: Policy learning
dc.subject: Security
dc.subject: Artificial Intelligence
dc.title: Security of deep reinforcement learning
dc.type: Dissertation
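
A minimal sketch of the observation-perturbation attack mentioned in the abstract, assuming a PyTorch Q-network and a single FGSM-style sign-gradient step that lowers the Q-value of the agent's greedy action. The TinyDQN network, the fgsm_observation_attack helper, the epsilon value, and the random observation are hypothetical placeholders for illustration, not artifacts of the dissertation itself.

# Illustrative sketch only: craft a small perturbation of a DQN agent's state
# observation by stepping against the gradient of the greedy action's Q-value.
import torch
import torch.nn as nn

class TinyDQN(nn.Module):
    """A stand-in Q-network: maps a flat observation to per-action Q-values."""
    def __init__(self, obs_dim: int = 8, n_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def fgsm_observation_attack(q_net: nn.Module, obs: torch.Tensor, epsilon: float = 0.01) -> torch.Tensor:
    """Return obs - epsilon * sign(grad), where the gradient is that of the
    Q-value of the agent's current greedy action with respect to the observation."""
    obs = obs.clone().detach().requires_grad_(True)
    q_values = q_net(obs)
    greedy_action = q_values.argmax(dim=-1)
    # "Loss" is the Q-value of the greedy action; descending its gradient
    # pushes the observation away from the agent's preferred choice.
    loss = q_values.gather(-1, greedy_action.unsqueeze(-1)).sum()
    loss.backward()
    # One sign-gradient step; no clipping to the valid observation range, for brevity.
    perturbed = obs - epsilon * obs.grad.sign()
    return perturbed.detach()

if __name__ == "__main__":
    torch.manual_seed(0)
    q_net = TinyDQN()
    clean_obs = torch.randn(1, 8)  # hypothetical state observation
    adv_obs = fgsm_observation_attack(q_net, clean_obs, epsilon=0.05)
    print("action on clean observation:", q_net(clean_obs).argmax(dim=-1).item())
    print("action on perturbed observation:", q_net(adv_obs).argmax(dim=-1).item())

Depending on the network and epsilon, the perturbed observation may or may not flip the greedy action; the point of the sketch is only to show how a small, gradient-informed change to the observed state can alter the decision a DRL policy makes.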

Files

Original bundle
Name: VahidBehzadan2019.pdf
Size: 1.32 MB
Format: Adobe Portable Document Format
Description: Dissertation

License bundle
Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission