Multi-agent based reinforcement learning framework for multi objective resource provisioning in cloud environments


Date

2025


Abstract

Cloud computing provides a powerful and flexible platform for executing large-scale and complex applications through a pay-as-you-go model. Due to its scalability, elasticity, and economic benefits, many enterprises now rely on cloud services for their business-critical operations. To meet growing demand, improve fault tolerance, and avoid vendor lock-in, organizations are increasingly adopting multi-cloud environments, leveraging the diverse capabilities of multiple cloud providers across different pricing models and geographic regions.

However, efficient and dynamic resource allocation in multi-cloud environments remains a significant challenge, particularly when aiming to balance conflicting objectives such as minimizing cost, meeting deadlines, and maintaining high quality of service. Traditional static scheduling methods lack the adaptability required for these dynamic, heterogeneous environments. Existing multi-objective provisioning techniques often struggle with scalability and responsiveness, especially under fluctuating and bursty workloads.

To address these limitations, this thesis introduces MARL4RP (Multi-Agent Reinforcement Learning for Resource Provisioning), a novel framework that leverages the power of reinforcement learning to tackle the dual challenges of resource provisioning and workflow scheduling in multi-cloud Infrastructure-as-a-Service (IaaS) environments. MARL4RP employs a decentralized, multi-agent reinforcement learning architecture that enables the system to learn optimal policies dynamically, with the goal of minimizing both operational costs and task execution latency.

The framework was evaluated using two state-of-the-art reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Deep Q-Network (DQN), across varying task loads. Experimental results reveal that PPO consistently outperforms DQN, particularly under high and bursty workloads, achieving lower VM costs and improved execution times. While DQN demonstrates efficiency at lighter loads, PPO proves more robust and scalable as workload intensity increases. These findings underscore the potential of MARL4RP as an intelligent and adaptive resource provisioning solution for dynamic and large-scale multi-cloud environments, paving the way for future research in autonomous cloud infrastructure management.
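The dual objective described above (minimizing VM cost and execution latency together) can be illustrated with a minimal toy sketch. This is not the thesis code: the VM types, prices, speed factors, and the 0.1 latency weight are all hypothetical, and simple epsilon-greedy value updates stand in for PPO/DQN. Each decentralized agent keeps its own value table, mirroring the framework's multi-agent structure at a very small scale.

```python
import random
from collections import defaultdict

# Hypothetical VM catalog: (hourly cost, speed factor). Illustrative only.
VM_TYPES = {
    "small": (0.05, 1.0),
    "large": (0.20, 3.0),
}

def reward(vm, task_size):
    """Combined multi-objective penalty: cost plus weighted latency."""
    cost, speed = VM_TYPES[vm]
    latency = task_size / speed
    return -(cost + 0.1 * latency)

def train(episodes=2000, eps=0.1, alpha=0.5, n_agents=2):
    # One independent value table per agent (decentralized learners).
    agents = [defaultdict(float) for _ in range(n_agents)]
    rng = random.Random(0)
    for _ in range(episodes):
        task = rng.choice([1, 5, 10])  # fluctuating task sizes
        for q in agents:
            # Epsilon-greedy choice of VM type for this task.
            if rng.random() < eps:
                vm = rng.choice(list(VM_TYPES))
            else:
                vm = max(VM_TYPES, key=lambda v: q[(task, v)])
            # Incremental value update toward the observed reward.
            r = reward(vm, task)
            q[(task, vm)] += alpha * (r - q[(task, vm)])
    return agents

q_tables = train()
# Large tasks should favor the faster (but costlier) VM; small tasks the cheap one.
best_for_big = max(VM_TYPES, key=lambda v: q_tables[0][(10, v)])
best_for_small = max(VM_TYPES, key=lambda v: q_tables[0][(1, v)])
print(best_for_big, best_for_small)
```

The learned policy differs by load: the weighted reward makes the expensive, fast VM pay off only when latency dominates, which is the same cost/latency trade-off the framework's agents must balance.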

Citation

Jayaweera, P.S.L.A. (2025). Multi-agent based reinforcement learning framework for multi objective resource provisioning in cloud environments [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. https://dl.lib.uom.lk/handle/123/24505
