Research on a Text Detection Method Based on Pixel Assignment

Authors: Ji Xunsheng, Yu Zhi, Xu Xiaoxiang
Affiliation: Jiangnan University
Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)

Text Detection Based on Pixel to Box Assignment

Abstract:

To address the shortcomings of existing scene text detection methods, this paper proposes a scene text detection method based on pixel-to-box assignment. A cross-attention module and a multi-scale feature adaptive module optimize feature extraction along the spatial and channel dimensions, respectively. To enrich the feature representation across scales, the multi-scale feature adaptive module automatically assigns weights to features of different scales. To capture contextual information effectively, the features extracted by the feature network are fed into the cross-attention module, which collects context for each pixel along its horizontal and vertical paths; applying this operation recurrently lets every pixel obtain context from the whole image. A fully convolutional network learns the geometric features of text instances within a multi-task learning framework, the outputs of the individual tasks are combined to assign pixels to text boxes, and the polygonal bounding box of each text instance is reconstructed after simple processing. On the public arbitrary-shape dataset Total-text, the method achieves a recall, precision, and F-value of 75.71%, 89.15%, and 81.89%, respectively. On the public multi-oriented dataset ICDAR2015 it achieves a recall, precision, and F-value of 79.06%, 89.24%, and 83.84%, respectively, demonstrating the effectiveness of the proposed method.
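The abstract states that collecting context along each pixel's horizontal and vertical paths, and then repeating the operation, gives every pixel context from the whole image. The Python sketch below illustrates only that coverage argument. It is not the authors' module: it computes no attention weights and merely tracks which positions can influence each pixel. The function names `criss_cross_step` and `coverage_after` are hypothetical.

```python
def criss_cross_step(reach, height, width):
    """One pass: every pixel gathers from all pixels in its row and column.

    `reach` maps each (row, col) to the set of positions whose information
    that pixel currently holds. This is an illustrative stand-in for an
    attention pass; no weights are computed.
    """
    new_reach = {}
    for r in range(height):
        for c in range(width):
            # Positions on the horizontal path and on the vertical path.
            path = [(r, k) for k in range(width)] + [(k, c) for k in range(height)]
            gathered = set()
            for position in path:
                gathered |= reach[position]
            new_reach[(r, c)] = gathered
    return new_reach


def coverage_after(passes, height, width):
    """Return the reach map after applying the row/column pass `passes` times."""
    # Initially each pixel only holds its own information.
    reach = {(r, c): {(r, c)} for r in range(height) for c in range(width)}
    for _ in range(passes):
        reach = criss_cross_step(reach, height, width)
    return reach
```

On a 4 x 5 grid, one pass gives each pixel its own row and column (4 + 5 - 1 = 8 positions). A second pass reaches all 20 positions, because every pixel in the row has already gathered its own column.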

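Cite this article

Ji Xunsheng, Yu Zhi, Xu Xiaoxiang. Research on a text detection method based on pixel assignment[J]. Computer Measurement & Control, 2023, 31(7): 21-27.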

History

Received: 2023-02-08
Last revised: 2023-03-06
Accepted: 2023-03-06
Published online: 2023-07-12
