Simultaneous beam selection and users scheduling evaluation in a virtual world with reinforcement learning

Authors: Ilan Correa, Ailton Oliveira, Bojian Du, Cleverson Nahum, Daisuke Kobuchi, Felipe Bastos, Hirofumi Ohzeki, Joao Borges, Mohit Mehta, Pedro Batista, Ryoma Kondo, Sundesh Gupta, Vimal Bhatia, Aldebaro Klautau
Status: Final
Date of publication: 22 September 2022
Published in: ITU Journal on Future and Evolving Technologies, Volume 3 (2022), Issue 2, Pages 202-213
Article DOI: https://doi.org/10.52953/CHUZ8770
Abstract:
The fifth generation of mobile networks evolved to serve applications with distinct requirements, which results in high management complexity due to simultaneous real-time tasks. In the physical layer, the code words that allow proper data exchange between the Base Station (BS) and the served users must be chosen, while in higher layers the BS must choose which users to serve in a given transmission opportunity. There are approaches based on Machine Learning (ML) to solve these combined tasks. However, due to the large number of possible inputs, a challenge is the availability of data to train the models. In some cases, there may not even exist a predefined optimal answer to use as a "label" for supervised approaches. In this paper, we evaluate solutions for the combined problems of beam selection and user scheduling with Reinforcement Learning (RL), which does not need labels and therefore suits problems without a predefined answer. The algorithms were proposed for Problem Statement 6 of the challenge organized by the International Telecommunication Union (ITU) in 2021, in which they ranked as finalists. We compare the approaches in terms of the cumulative reward received by the agents and against baselines developed for the challenge. The paper also shows how the actions taken by the trained agents affect network operation by comparing the number of packets transmitted, which is highly related to the proper selection of users and code words.

Keywords: Beam selection, reinforcement learning, user scheduling, virtual world
Rights: © International Telecommunication Union, available under the CC BY-NC-ND 3.0 IGO license.