Research on Few-Shot Talking Head Video Generation Algorithm Guided by Sketches

2024, Volume 32, Issue 10, pp. 236-242

Authors: Wei Qingyang, Xu Shugong
Affiliation: School of Communication and Information Engineering, Shanghai University
CLC number: TP37
Funding: National Natural Science Foundation of China (61871262)

Research on Few-Shot Talking Head Video Generation Algorithm Guided by Sketches

Abstract:

Talking-head video generation requires precise joint modeling of facial texture and the driving audio. To this end, this work studies semantically guided deformation of texture features and proposes a sketch-guided few-shot talking-head video generation framework that uses two-stage generation for modality alignment. In the first stage, ground-truth prior landmark information is used to generate the target facial landmarks from speech. In the second stage, the landmarks are converted into sketches, which serve as an intermediate representation and are semantically aligned with the reference images. Introducing sketches effectively addresses the modality mismatch between audio and images. In experiments, the algorithm achieves FID scores of 15.676 and 8.618 on the public HDTF and MEAD datasets, respectively. These results indicate that the proposed algorithm can effectively model facial texture driven by the target audio through the intermediate representation, reaching generation quality comparable to state-of-the-art algorithms.
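The abstract does not say how landmarks are turned into sketches, so the following is only a minimal illustration of the second-stage idea, not the authors' implementation. It assumes each facial part is a polyline of integer pixel coordinates and joins consecutive landmarks with straight segments (Bresenham's line algorithm) on a binary canvas. The names `draw_line` and `landmarks_to_sketch`, and the toy coordinates, are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates
Group = Tuple[Sequence[Point], bool]  # (landmarks of one facial part, closed contour?)


def draw_line(canvas: List[List[int]], p0: Point, p1: Point) -> None:
    """Rasterize the segment p0 -> p1 with Bresenham's line algorithm."""
    x0, y0 = p0
    x1, y1 = p1
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        # Silently skip pixels that fall outside the canvas.
        if 0 <= y0 < len(canvas) and 0 <= x0 < len(canvas[0]):
            canvas[y0][x0] = 1
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            if x0 == x1:
                break
            err += dy
            x0 += sx
        if e2 <= dx:
            if y0 == y1:
                break
            err += dx
            y0 += sy


def landmarks_to_sketch(groups: Sequence[Group], height: int, width: int) -> List[List[int]]:
    """Draw every landmark group as a polyline on a height x width binary canvas."""
    canvas = [[0] * width for _ in range(height)]
    for points, closed in groups:
        segments = list(zip(points, points[1:]))
        if closed and len(points) > 2:
            # Close the contour (e.g. lips or eyes) back to the first landmark.
            segments.append((points[-1], points[0]))
        for p0, p1 in segments:
            draw_line(canvas, p0, p1)
    return canvas


if __name__ == "__main__":
    # Toy "mouth" contour; the coordinates are made up for illustration.
    mouth = ([(1, 2), (3, 1), (5, 2), (3, 3)], True)
    for row in landmarks_to_sketch([mouth], height=5, width=7):
        print("".join("#" if v else "." for v in row))
```

In a real pipeline the resulting sketch image, rather than the raw audio features, would be the input that is aligned with the reference images, which is the role the abstract assigns to the intermediate representation.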

Cite this article

Wei Qingyang, Xu Shugong. Research on Few-Shot Talking Head Video Generation Algorithm Guided by Sketches[J]. Computer Measurement & Control, 2024, 32(10): 236-242.

History

Received: 2024-04-28
Last revised: 2024-05-09
Accepted: 2024-05-09
Published online: 2024-10-30
