SeamlessM4T v2 - Real-Time Speech Translation Model

Seamless M4T

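SeamlessM4T (Massively Multilingual and Multimodal Machine Translation model), which Meta announced three months ago, has now been updated to v2, and the latest source code is available for download on GitHub. SeamlessM4T v2 is an updated version built on the UnitY2 architecture. Compared with SeamlessM4T v1, the new model improves both translation quality and inference latency in speech generation tasks.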

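M4T is an all-in-one, massively multilingual and multimodal machine translation model that delivers high-quality translation of speech and text for nearly 100 languages.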

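SeamlessM4T models support the following tasks:

  • Speech-to-speech translation (S2ST)
  • Speech-to-text translation (S2TT)
  • Text-to-speech translation (T2ST)
  • Text-to-text translation (T2TT)
  • Automatic speech recognition (ASR)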
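
The five task abbreviations differ only in their input and output modality, plus whether the output stays in the source language. A minimal sketch of that mapping follows. The `Task` tuple, the `TASKS` dictionary, and the `pick_task` helper are hypothetical names made up for illustration and are not part of the SeamlessM4T code base.

```python
from typing import NamedTuple


class Task(NamedTuple):
    source: str       # input modality: "speech" or "text"
    target: str       # output modality: "speech" or "text"
    translates: bool  # False when the output stays in the source language


# Task abbreviations as listed above; the modality labels are our own shorthand.
TASKS = {
    "S2ST": Task("speech", "speech", True),
    "S2TT": Task("speech", "text", True),
    "T2ST": Task("text", "speech", True),
    "T2TT": Task("text", "text", True),
    "ASR": Task("speech", "text", False),
}


def pick_task(source: str, target: str, translates: bool = True) -> str:
    """Return the task abbreviation for the given modalities (hypothetical helper)."""
    wanted = Task(source, target, translates)
    for name, task in TASKS.items():
        if task == wanted:
            return name
    raise ValueError(f"no task maps {source} to {target} (translates={translates})")


print(pick_task("speech", "text"))         # S2TT
print(pick_task("speech", "text", False))  # ASR
```

Note that S2TT and ASR share the same modalities; the only difference in this sketch is whether the text comes out in another language.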

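The table below lists the languages supported by SeamlessM4T-large (v1/v2). The Source column indicates whether a language is supported as source speech (Sp) and/or source text (Tx). The Target column indicates whether a language is supported as target speech (Sp) and/or target text (Tx). Unfortunately, Cantonese is not yet available as a speech (TTS) target!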

Code | Language | Script | Source | Target
afr | Afrikaans | Latn | Sp, Tx | Tx
amh | Amharic | Ethi | Sp, Tx | Tx
arb | Modern Standard Arabic | Arab | Sp, Tx | Sp, Tx
ary | Moroccan Arabic | Arab | Sp, Tx | Tx
arz | Egyptian Arabic | Arab | Sp, Tx | Tx
asm | Assamese | Beng | Sp, Tx | Tx
ast | Asturian | Latn | Sp | --
azj | North Azerbaijani | Latn | Sp, Tx | Tx
bel | Belarusian | Cyrl | Sp, Tx | Tx
ben | Bengali | Beng | Sp, Tx | Sp, Tx
bos | Bosnian | Latn | Sp, Tx | Tx
bul | Bulgarian | Cyrl | Sp, Tx | Tx
cat | Catalan | Latn | Sp, Tx | Sp, Tx
ceb | Cebuano | Latn | Sp, Tx | Tx
ces | Czech | Latn | Sp, Tx | Sp, Tx
ckb | Central Kurdish | Arab | Sp, Tx | Tx
cmn | Mandarin Chinese | Hans | Sp, Tx | Sp, Tx
cmn_Hant | Mandarin Chinese | Hant | Sp, Tx | Sp, Tx
cym | Welsh | Latn | Sp, Tx | Sp, Tx
dan | Danish | Latn | Sp, Tx | Sp, Tx
deu | German | Latn | Sp, Tx | Sp, Tx
ell | Greek | Grek | Sp, Tx | Tx
eng | English | Latn | Sp, Tx | Sp, Tx
est | Estonian | Latn | Sp, Tx | Sp, Tx
eus | Basque | Latn | Sp, Tx | Tx
fin | Finnish | Latn | Sp, Tx | Sp, Tx
fra | French | Latn | Sp, Tx | Sp, Tx
fuv | Nigerian Fulfulde | Latn | Sp, Tx | Tx
gaz | West Central Oromo | Latn | Sp, Tx | Tx
gle | Irish | Latn | Sp, Tx | Tx
glg | Galician | Latn | Sp, Tx | Tx
guj | Gujarati | Gujr | Sp, Tx | Tx
heb | Hebrew | Hebr | Sp, Tx | Tx
hin | Hindi | Deva | Sp, Tx | Sp, Tx
hrv | Croatian | Latn | Sp, Tx | Tx
hun | Hungarian | Latn | Sp, Tx | Tx
hye | Armenian | Armn | Sp, Tx | Tx
ibo | Igbo | Latn | Sp, Tx | Tx
ind | Indonesian | Latn | Sp, Tx | Sp, Tx
isl | Icelandic | Latn | Sp, Tx | Tx
ita | Italian | Latn | Sp, Tx | Sp, Tx
jav | Javanese | Latn | Sp, Tx | Tx
jpn | Japanese | Jpan | Sp, Tx | Sp, Tx
kam | Kamba | Latn | Sp | --
kan | Kannada | Knda | Sp, Tx | Tx
kat | Georgian | Geor | Sp, Tx | Tx
kaz | Kazakh | Cyrl | Sp, Tx | Tx
kea | Kabuverdianu | Latn | Sp | --
khk | Halh Mongolian | Cyrl | Sp, Tx | Tx
khm | Khmer | Khmr | Sp, Tx | Tx
kir | Kyrgyz | Cyrl | Sp, Tx | Tx
kor | Korean | Kore | Sp, Tx | Sp, Tx
lao | Lao | Laoo | Sp, Tx | Tx
lit | Lithuanian | Latn | Sp, Tx | Tx
ltz | Luxembourgish | Latn | Sp | --
lug | Ganda | Latn | Sp, Tx | Tx
luo | Luo | Latn | Sp, Tx | Tx
lvs | Standard Latvian | Latn | Sp, Tx | Tx
mai | Maithili | Deva | Sp, Tx | Tx
mal | Malayalam | Mlym | Sp, Tx | Tx
mar | Marathi | Deva | Sp, Tx | Tx
mkd | Macedonian | Cyrl | Sp, Tx | Tx
mlt | Maltese | Latn | Sp, Tx | Sp, Tx
mni | Meitei | Beng | Sp, Tx | Tx
mya | Burmese | Mymr | Sp, Tx | Tx
nld | Dutch | Latn | Sp, Tx | Sp, Tx
nno | Norwegian Nynorsk | Latn | Sp, Tx | Tx
nob | Norwegian Bokmål | Latn | Sp, Tx | Tx
npi | Nepali | Deva | Sp, Tx | Tx
nya | Nyanja | Latn | Sp, Tx | Tx
oci | Occitan | Latn | Sp | --
ory | Odia | Orya | Sp, Tx | Tx
pan | Punjabi | Guru | Sp, Tx | Tx
pbt | Southern Pashto | Arab | Sp, Tx | Tx
pes | Western Persian | Arab | Sp, Tx | Sp, Tx
pol | Polish | Latn | Sp, Tx | Sp, Tx
por | Portuguese | Latn | Sp, Tx | Sp, Tx
ron | Romanian | Latn | Sp, Tx | Sp, Tx
rus | Russian | Cyrl | Sp, Tx | Sp, Tx
slk | Slovak | Latn | Sp, Tx | Sp, Tx
slv | Slovenian | Latn | Sp, Tx | Tx
sna | Shona | Latn | Sp, Tx | Tx
snd | Sindhi | Arab | Sp, Tx | Tx
som | Somali | Latn | Sp, Tx | Tx
spa | Spanish | Latn | Sp, Tx | Sp, Tx
srp | Serbian | Cyrl | Sp, Tx | Tx
swe | Swedish | Latn | Sp, Tx | Sp, Tx
swh | Swahili | Latn | Sp, Tx | Sp, Tx
tam | Tamil | Taml | Sp, Tx | Tx
tel | Telugu | Telu | Sp, Tx | Sp, Tx
tgk | Tajik | Cyrl | Sp, Tx | Tx
tgl | Tagalog | Latn | Sp, Tx | Sp, Tx
tha | Thai | Thai | Sp, Tx | Sp, Tx
tur | Turkish | Latn | Sp, Tx | Sp, Tx
ukr | Ukrainian | Cyrl | Sp, Tx | Sp, Tx
urd | Urdu | Arab | Sp, Tx | Sp, Tx
uzn | Northern Uzbek | Latn | Sp, Tx | Sp, Tx
vie | Vietnamese | Latn | Sp, Tx | Sp, Tx
xho | Xhosa | Latn | Sp | --
yor | Yoruba | Latn | Sp, Tx | Tx
yue | Cantonese | Hant | Sp, Tx | Tx
zlm | Colloquial Malay | Latn | Sp | --
zsm | Standard Malay | Latn | Tx | Tx
zul | Zulu | Latn | Sp, Tx | Tx
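
To make the table easier to reason about, here is a small sketch that checks whether a given source and target combination is listed. It hard-codes only a handful of rows copied from the table above. The `LANGUAGES` dictionary and the `is_supported` helper are hypothetical names for illustration, not part of any SeamlessM4T API.

```python
# A few rows copied from the table above: code -> (source modalities, target modalities).
# "Sp" = speech, "Tx" = text. The remaining rows would be added the same way.
LANGUAGES = {
    "eng": ({"Sp", "Tx"}, {"Sp", "Tx"}),
    "cmn": ({"Sp", "Tx"}, {"Sp", "Tx"}),
    "yue": ({"Sp", "Tx"}, {"Tx"}),
    "ast": ({"Sp"}, set()),
    "zsm": ({"Tx"}, {"Tx"}),
}


def is_supported(src_lang: str, src_mod: str, tgt_lang: str, tgt_mod: str) -> bool:
    """Hypothetical helper: True if the table lists both ends of the translation."""
    src_mods, _ = LANGUAGES[src_lang]
    _, tgt_mods = LANGUAGES[tgt_lang]
    return src_mod in src_mods and tgt_mod in tgt_mods


print(is_supported("eng", "Sp", "yue", "Tx"))  # True
print(is_supported("eng", "Sp", "yue", "Sp"))  # False
print(is_supported("zsm", "Sp", "eng", "Tx"))  # False
```

The second call returns False because Cantonese (yue) appears in the table only as a text target, so English speech can be translated into Cantonese text but not into Cantonese speech.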

Note that seamlessM4T-medium supports 200 languages in the text modality and is based on NLLB-200 (see the full list in the asset card).

Hugging Face Demo (A100 GPU)

