
多模態(tài)特征融合的長(cháng)視頻行為識別方法
DOI:
CSTR:
作者:
作者單位:

西安建筑科技大學(xué)

作者簡(jiǎn)介:

通訊作者:

中圖分類(lèi)號:

基金項目:

陜西省自然科學(xué)基金面上項目(2020JM-473,2020JM-472)、西安建筑科技大學(xué)基礎研究基金項目(JC1703)、西安建筑科技大學(xué)自然科學(xué)基金項目(ZR19046)


Abstract:

Action recognition technology has important application value in video retrieval. To address the shortcomings of convolutional-neural-network-based action recognition methods, namely limited ability to recognize long-duration actions, difficulty in extracting multi-scale features, and susceptibility to illumination changes and complex background interference, a long-video action recognition method based on multimodal feature fusion is proposed. First, since consecutive frames of a long-duration action differ only slightly and therefore introduce redundancy, a uniform sparse sampling strategy is used to model the temporal structure of the whole video, fully preserving long-range temporal information while reducing frame redundancy. Second, multi-column convolution is used to extract multi-scale spatio-temporal features, weakening the interference that viewpoint changes introduce into video images. Then, optical flow data are introduced, and their deep features are extracted by a feature extraction network guided by a spatial attention mechanism; the complementary strengths of the different data modalities improve the accuracy and robustness of the network in different scenes. Finally, the multi-scale spatio-temporal features and the optical flow features are fused at the fully connected layer of the network, realizing end-to-end long-video action recognition. Experimental results show that the proposed method achieves average accuracies of 97.2% and 72.8% on the UCF101 and HMDB51 datasets, respectively, outperforming the compared methods and confirming its effectiveness.
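To make the pipeline outlined above concrete, the following is a minimal PyTorch sketch of its four stages: uniform sparse sampling over equal temporal segments, a multi-column convolution block for multi-scale features, a spatial-attention-gated optical-flow branch, and late fusion at the fully connected layer. All module names, channel widths, the segment count (8), and the flow-stack size (10) are illustrative assumptions; the authors' actual network is not reproduced on this page.

```python
# Minimal sketch of the pipeline described in the abstract (hypothetical layer sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F


def uniform_sparse_sample(num_frames: int, num_segments: int = 8) -> torch.Tensor:
    """Pick one frame index from each of `num_segments` equal temporal segments."""
    bounds = torch.linspace(0, num_frames, num_segments + 1).long()
    return torch.stack([(lo + hi) // 2 for lo, hi in zip(bounds[:-1], bounds[1:])])


class MultiColumnConv(nn.Module):
    """Parallel convolution columns with different kernel sizes -> multi-scale features."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.columns = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, kernel_size=k, padding=k // 2) for k in (3, 5, 7)
        ])

    def forward(self, x):
        return torch.cat([F.relu(col(x)) for col in self.columns], dim=1)


class SpatialAttention(nn.Module):
    """Single-channel spatial mask that re-weights optical-flow feature maps."""
    def __init__(self, channels: int):
        super().__init__()
        self.mask = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        return x * torch.sigmoid(self.mask(x))


class TwoStreamFusionNet(nn.Module):
    """RGB multi-scale branch + attention-guided flow branch, fused at the FC layer."""
    def __init__(self, num_classes: int, flow_stack: int = 10):
        super().__init__()
        self.rgb_branch = nn.Sequential(MultiColumnConv(3, 32), nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.flow_conv = nn.Sequential(nn.Conv2d(2 * flow_stack, 96, 3, padding=1), nn.ReLU())
        self.flow_attn = SpatialAttention(96)
        self.flow_pool = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.fc = nn.Linear(32 * 3 + 96, num_classes)  # late fusion of both modalities

    def forward(self, rgb, flow):
        # rgb: (B, T, 3, H, W) sparsely sampled frames; flow: (B, 2*flow_stack, H, W)
        b, t = rgb.shape[:2]
        rgb_feat = self.rgb_branch(rgb.flatten(0, 1)).view(b, t, -1).mean(dim=1)  # segment consensus
        flow_feat = self.flow_pool(self.flow_attn(self.flow_conv(flow)))
        return self.fc(torch.cat([rgb_feat, flow_feat], dim=1))


if __name__ == "__main__":
    idx = uniform_sparse_sample(num_frames=300, num_segments=8)  # indices of frames to decode
    net = TwoStreamFusionNet(num_classes=101)                    # e.g. UCF101
    logits = net(torch.randn(2, 8, 3, 112, 112), torch.randn(2, 20, 112, 112))
    print(idx.tolist(), logits.shape)                            # torch.Size([2, 101])
```

The sketch fuses the two modalities only at the classifier input, mirroring the abstract's statement that fusion happens at the fully connected layer; averaging the per-segment RGB features is one simple way to aggregate the sparsely sampled frames.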

    參考文獻
    相似文獻
    引證文獻
Cite this article

王婷,劉光輝,張鈺敏,孟月波,徐勝軍.多模態(tài)特征融合的長(cháng)視頻行為識別方法計算機測量與控制[J].,2021,29(11):165-170.

復制
分享
文章指標
  • 點(diǎn)擊次數:
  • 下載次數:
  • HTML閱讀次數:
  • 引用次數:
歷史
  • 收稿日期:2021-04-08
  • 最后修改日期:2021-05-11
  • 錄用日期:2021-05-12
  • 在線(xiàn)發(fā)布日期: 2021-11-22
  • 出版日期:
文章二維碼
西盟| 新巴尔虎左旗| 无极县| 康保县| 牟定县| 淮阳县| 清河县| 浮梁县| 阿拉善左旗| 义马市| 青川县| 清新县| 凤城市| 台北市| 迁安市| 西昌市| 陆河县| 巍山| 北京市| 望奎县| 东乡县| 江口县| 新安县| 汶川县| 防城港市| 禹州市| 宁德市| 襄垣县| 定远县| 苍溪县| 东辽县| 汝城县| 宁明县| 宜川县| 南木林县| 克山县| 山阴县| 阜城县| 海阳市| 聂荣县| 萝北县|