10/16/2024
By Yasin Findik

The Kennedy College of Sciences, Department of Computer Science, invites you to attend a doctoral dissertation proposal defense by Yasin Findik on “Advanced Cooperation Algorithms in MARL: From Discrete to Continuous.”

Candidate: Yasin Findik
Date: Monday, October 28, 2024
Time: 10 a.m. to 11:30 a.m. (ET)
Location: DAN 309

Committee Members:
Reza Azadeh (Advisor), Miner School of Computer & Information Sciences, University of Massachusetts Lowell
Tingjian Ge (Member), Miner School of Computer & Information Sciences, University of Massachusetts Lowell
Matteo Leonetti (Member), Computer Science, King's College London
Hadi Amiri (Member), Miner School of Computer & Information Sciences, University of Massachusetts Lowell

Title: Advanced Cooperation Algorithms in MARL: From Discrete to Continuous

Abstract:

The rapid advancement of intelligent systems has sparked significant interest in reinforcement learning (RL) due to its potential to enable autonomous agents to learn optimal behaviors through interactions with their environment. As the complexity of these environments and tasks increases, the need for agents to operate both independently and collaboratively has become more apparent. This has led to the emergence of multi-agent reinforcement learning (MARL), a field focused on developing frameworks that allow multiple agents to cooperate, compete, or coexist to achieve individual or collective objectives. MARL has become increasingly important in real-world applications such as robotics, autonomous vehicles, and finance, where agents must make decisions that account for both their own actions and those of other agents.

This thesis explores the challenges and opportunities presented by MARL in real-world applications, where agents must operate under conditions of partial observability, non-stationarity, and the need for coordinated decision-making. Traditional single-agent reinforcement learning (SARL) methods, although effective in isolated environments, fall short when applied to multi-agent settings due to the additional complexity introduced by inter-agent interactions. These complexities necessitate more sophisticated algorithms capable of managing decentralized decision-making, enhancing cooperation, and ensuring robustness. To address these issues, this thesis introduces novel methods aimed at improving the efficiency and effectiveness of MARL systems in both discrete and continuous action domains, leveraging recent advances in deep learning and RL.

First, the thesis introduces a novel relational-awareness (RA) based cooperation strategy, which allows agents to work together more effectively by incorporating awareness of the relationships between agents. We evaluate the effectiveness of our proposed approach by conducting fifteen experiments in two different discrete environments. The results demonstrate that our proposed algorithm can influence and shape team behavior, guide cooperation strategies, and expedite agent learning. These results suggest that our approach is promising for multi-agent systems, especially those in which agents have diverse properties.
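
The abstract does not describe the mechanism in detail, but a minimal sketch of one common way to encode relational awareness (an assumption for illustration, not the candidate's method) is to blend each agent's reward with its teammates' rewards according to weights drawn from a relational graph. The `relation_weights` matrix and `mix_rewards` helper below are hypothetical names used only to make the idea concrete.

```python
import numpy as np

# Hypothetical relational graph: entry [i, j] says how much agent i
# "cares about" agent j's reward (an illustrative assumption, not the
# algorithm presented in the thesis).
relation_weights = np.array([
    [1.0, 0.5, 0.0],   # agent 0 values agent 1's reward at 50%
    [0.5, 1.0, 0.5],   # agent 1 values both neighbors
    [0.0, 0.5, 1.0],   # agent 2 values agent 1's reward at 50%
])

def mix_rewards(individual_rewards: np.ndarray) -> np.ndarray:
    """Blend each agent's reward with its teammates' rewards using the
    relational weights, then normalize so the scale stays comparable."""
    mixed = relation_weights @ individual_rewards
    return mixed / relation_weights.sum(axis=1)

# Example: agent 1 scores, and its relational neighbors share the credit.
print(mix_rewards(np.array([0.0, 1.0, 0.0])))  # -> [0.333..., 0.5, 0.333...]
```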

Another key contribution is the development of Mixed Q-Functionals (MQF), a value-based algorithm designed for continuous action domains, which significantly outperforms existing methods both in overall performance and in fostering collaboration among agents. We evaluate the efficacy of our algorithm in six cooperative multi-agent scenarios within continuous environments. Our empirical findings reveal that MQF outperforms four variants of Deep Deterministic Policy Gradient, owing to its rapid action evaluation and increased sample efficiency.
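
As a rough illustration of why a value-based approach can evaluate actions quickly in a continuous domain (a sketch under my own assumptions, not the MQF implementation), a network can map the state to coefficients of a basis over actions, so many candidate actions are scored with a single matrix product and the greedy action is read off with an argmax. The `q_coefficients` function and the polynomial basis below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def action_basis(actions: np.ndarray) -> np.ndarray:
    """Polynomial basis features for 1-D actions: [1, a, a^2, a^3]."""
    return np.stack([np.ones_like(actions), actions, actions**2, actions**3], axis=1)

def q_coefficients(state: np.ndarray) -> np.ndarray:
    """Stand-in for a learned network mapping a state to basis coefficients.
    (Randomly initialized here purely for illustration.)"""
    weights = rng.normal(size=(state.shape[0], 4))
    return state @ weights  # shape: (4,)

state = rng.normal(size=3)                       # toy 3-dimensional state
candidate_actions = np.linspace(-1.0, 1.0, 256)  # sample many continuous actions

# One matrix product scores all 256 candidate actions at once ...
q_values = action_basis(candidate_actions) @ q_coefficients(state)
# ... and the greedy continuous action is the argmax over the samples.
best_action = candidate_actions[np.argmax(q_values)]
print(best_action, q_values.max())
```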

Furthermore, this thesis presents the Collaborative Adaptation (CA) Framework, which leverages relational networks to improve the resilience of multi-agent systems in scenarios involving unexpected failures. Empirical evaluations across both discrete and continuous environments demonstrate that, in scenarios involving unforeseen malfunctions, state-of-the-art algorithms often converge to sub-optimal solutions, whereas the proposed CA framework not only mitigates the impact of failures and recovers more effectively but also offers valuable insights into the practical deployment of MARL algorithms in dynamic, real-world applications.
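
The framework itself is not specified in the abstract; as a loose sketch of the general idea of adapting a relational network after a malfunction (all names below are hypothetical and assumed, not taken from the thesis), the remaining healthy agents can re-weight how much they attend to a failed teammate's objective so the team shifts toward covering its role.

```python
import numpy as np

def adapt_relations(relation_weights: np.ndarray, failed: list[int],
                    boost: float = 1.5) -> np.ndarray:
    """Hypothetical adaptation step: when some agents malfunction, the
    remaining agents increase the weight they place on the failed agents'
    objectives so the team compensates for the lost capability."""
    adapted = relation_weights.copy()
    healthy = [i for i in range(adapted.shape[0]) if i not in failed]
    for i in healthy:
        for j in failed:
            adapted[i, j] *= boost  # care more about the failed agent's task
    # Failed agents stop contributing their own (now unreliable) preferences.
    adapted[failed, :] = 0.0
    return adapted

relations = np.ones((3, 3))
print(adapt_relations(relations, failed=[2]))
```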