Research on Resource Allocation in UAV Networks Based on Reinforcement Learning

Affiliation: The 54th Research Institute of China Electronics Technology Group Corporation


Abstract:

This paper studies resource allocation in UAV networks, focusing on a reinforcement-learning-based dynamic time slot allocation scheme for multi-UAV networks. Allocating time slots sensibly is important for improving resource utilization in such networks. For the dynamic time slot allocation problem, a multi-UAV time slot allocation model is established from the constraints of the scheduling problem, and an allocation scheme based on the proximal policy optimization (PPO) reinforcement learning algorithm is proposed. The problem is mapped onto a reinforcement learning environment by building a Markov decision process (MDP) model that matches the algorithm's interface. The model is trained in the gym simulation environment to validate the proposed scheme. Simulation results show that the PPO-based scheme allocates time slots efficiently in a multi-UAV network and improves network channel utilization, and that the training time can be shortened according to practical requirements while still yielding a good allocation result.
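To make the MDP mapping concrete, the following is a minimal sketch of what a Gym-style time slot allocation environment could look like. It is not the authors' model: the abstract does not give the state, action, or reward definitions, so everything here is an assumption for illustration. The class name `SlotAllocationEnv`, the per-UAV packet demand as state, the "grant the current slot to one UAV" action, and the reward of 1 per usefully filled slot are all hypothetical. The sketch avoids importing gym and only mimics its `reset`/`step` convention; a simple greedy rule stands in for the policy that PPO would learn.

```python
import random


class SlotAllocationEnv:
    """Toy Gym-style environment (hypothetical, not the paper's model).

    One frame has n_slots time slots. Each UAV starts with a random
    packet demand. At every step the agent grants the current slot to
    one UAV; the slot is used only if that UAV still has demand.
    """

    def __init__(self, n_uavs=3, n_slots=6, max_demand=3, seed=0):
        self.n_uavs = n_uavs
        self.n_slots = n_slots
        self.max_demand = max_demand
        self.rng = random.Random(seed)
        self.demand = [0] * n_uavs
        self.slot = 0

    def _obs(self):
        # Observation: remaining demand per UAV, then the current slot index.
        return tuple(self.demand) + (self.slot,)

    def reset(self):
        self.demand = [self.rng.randint(0, self.max_demand)
                       for _ in range(self.n_uavs)]
        self.slot = 0
        return self._obs()

    def step(self, action):
        # Action: index of the UAV that receives the current slot.
        if not 0 <= action < self.n_uavs:
            raise ValueError("action must be a valid UAV index")
        if self.demand[action] > 0:
            self.demand[action] -= 1
            reward = 1.0  # slot carried a packet
        else:
            reward = 0.0  # slot wasted on an idle UAV
        self.slot += 1
        done = self.slot >= self.n_slots
        return self._obs(), reward, done, {}


def greedy_policy(obs):
    """Stand-in for a learned policy: serve the UAV with the most demand."""
    demand = obs[:-1]
    return max(range(len(demand)), key=lambda i: demand[i])


def run_episode(env, policy):
    """Run one frame and return channel utilization (used slots / all slots)."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total / env.n_slots


if __name__ == "__main__":
    env = SlotAllocationEnv()
    print("utilization:", run_episode(env, greedy_policy))
```

In a setup like the one the abstract describes, a PPO agent would replace `greedy_policy` and be trained on the episode rewards. Under these assumed definitions, the episode return divided by the number of slots gives the channel utilization.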

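Cite this article

Fan Wendi, Wang Junfang, Dang Tian, Du Longhai, Chen Cong. Research on Resource Allocation in UAV Networks Based on Reinforcement Learning[J]. Computer Measurement & Control, 2024, 32(1): 297-303.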

History

Received: 2023-11-02
Last revised: 2023-11-13
Accepted: 2023-11-14
Published online: 2024-01-29
