
Meta has created an AI model, ‘SeamlessM4T,’ that can translate and transcribe close to 100 languages across text and speech

Aug. 24, 2023.

“First all-in-one multilingual multimodal AI translation and transcription model,” says Meta

About the Writer

Amara Angelica

Amara Angelica is Senior Editor, Mindplex

The model is “the first all-in-one multilingual multimodal AI translation and transcription model,” Meta claims.

“It can perform speech-to-text, speech-to-speech, text-to-speech, and text-to-text translations for up to 100 languages, depending on the task … without having to first convert to text behind the scenes, among other things. We’re developing AI to eliminate language barriers in the physical world and in the metaverse.”

“In keeping with our approach to open science, we’re publicly releasing SeamlessM4T under a research license to allow researchers and developers to build on this work. We’re also releasing the metadata of SeamlessAlign, the biggest open multimodal translation dataset to date, totaling 270,000 hours of mined speech and text alignments.”

Competition

“Beyond the wealth of commercial services and open source models already available from Amazon, Microsoft, OpenAI and a number of startups, Google is creating what it calls the Universal Speech Model, a part of the tech giant’s larger effort to build a model that can understand the world’s 1,000 most-spoken languages,” says TechCrunch.

“Mozilla, meanwhile, spearheaded Common Voice, one of the largest multi-language collections of voices for training automatic speech recognition algorithms. But SeamlessM4T is among the more ambitious efforts to date to combine translation and transcription capabilities into a single model.

“In developing it, Meta says that it scraped publicly available text (on the order of ‘tens of billions’ of sentences) and speech (4 million hours) from the web.”
