A Dynamic Planning Method for Mobile Robots Based on Adaptive State-Aggregating Q-Learning

2014, Vol. 22, No. 10, pp. 3419-3422

Affiliations: 1. School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China; 2. Department of Electronic Information, Zhenjiang College, Zhenjiang, Jiangsu 212000, China
About the author: Wang Hui (b. 1980), female, from Danyang, Jiangsu; lecturer and master's degree candidate; research interests include virtual reality and artificial intelligence.
CLC number: TP393
Funding: Natural Science Research Program of Jiangsu Higher Education Institutions (03kjd520075).


Abstract:

Existing path planning methods for mobile robots converge slowly and are difficult to apply online. To address these problems, this paper studies SQ(λ), a dynamic path planning method that combines a state-aggregating self-organizing map (SOM) network with Q-learning with eligibility traces. First, an overall closed-loop planning model is designed that divides the system into a front end (state aggregation) and a back end (path planning). Second, an output layer is added to the traditional SOM to build a three-layer SOM network that aggregates the states of the mobile robot, and a training algorithm for the three-layer network is given. Finally, based on the aggregated states, an improved Q-learning algorithm with eligibility traces and an adaptively varying exploration factor is proposed to obtain the optimal policy; the number of neurons in the output layer of the front-end SOM is increased or decreased adaptively according to the convergence rate of the improved Q-learning algorithm, which improves the convergence of the overall algorithm. Simulation experiments show that SQ(λ) plans paths for the mobile robot effectively and, compared with other algorithms, converges faster and finds better solutions.
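The abstract does not give the update equations, so the following is only a minimal sketch of the two ingredients it names: aggregating raw robot states into a small set of discrete states, and a Q-learning backup with eligibility traces and a shrinking exploration factor. It is not the authors' SQ(λ). The function names, the nearest-prototype lookup (a stand-in for the trained three-layer SOM), the naive Q(λ) variant without trace cutting, the geometric decay schedule, and all parameter values are assumptions made for illustration.

```python
import random


def aggregate(position, prototypes):
    """Return the index of the prototype nearest to an (x, y) position.

    Hypothetical stand-in for the SOM front end: the prototypes are fixed
    here, whereas the paper trains and resizes the SOM output layer.
    """
    return min(
        range(len(prototypes)),
        key=lambda i: (position[0] - prototypes[i][0]) ** 2
        + (position[1] - prototypes[i][1]) ** 2,
    )


def decayed_epsilon(eps0, episode, rate=0.99, floor=0.01):
    """Assumed exploration schedule: geometric decay with a lower bound."""
    return max(floor, eps0 * rate ** episode)


def epsilon_greedy(Q, s, eps, rng=random):
    """Pick a random action with probability eps, else a greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(Q[s]))
    return max(range(len(Q[s])), key=lambda a: Q[s][a])


def q_lambda_step(Q, E, s, a, r, s_next, alpha=0.1, gamma=0.9, lam=0.8):
    """One naive tabular Q(lambda) backup with accumulating traces.

    Q and E are lists of per-state lists (action values and traces) and are
    updated in place. Traces are not reset after exploratory actions.
    """
    td_error = r + gamma * max(Q[s_next]) - Q[s][a]
    E[s][a] += 1.0  # mark the visited state-action pair
    for i in range(len(Q)):
        for j in range(len(Q[i])):
            Q[i][j] += alpha * td_error * E[i][j]  # credit recent pairs
            E[i][j] *= gamma * lam                 # decay every trace
    return td_error
```

In a training loop under these assumptions, each sensor reading would first pass through `aggregate`, the action would come from `epsilon_greedy` with `decayed_epsilon` as the exploration factor, and `q_lambda_step` would then be applied to the resulting transition. The adaptive growth and pruning of SOM output neurons described in the abstract is not modeled here.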

Cite this article

Wang Hui, Song Changtong. A Dynamic Planning Method for Mobile Robots Based on Adaptive State-Aggregating Q-Learning [J]. Computer Measurement & Control, 2014, 22(10): 3419-3422.

Published online: 2015-01-15
