Amazon announces foundation models for video generation

On Dec 4, 2024

At its annual conference, Amazon announced a new slate of artificial intelligence platforms, known as foundation models, allowing for text, image, and video generation, among other things.

The models are designed for customers who want to automate more of their services.

Amazon CEO Andy Jassy announced the new “Nova” models at a conference in Las Vegas on Tuesday.

“They want better latency. They want lower costs. They want the ability to do fine-tuning,” Jassy said.

The announcements mark Amazon’s biggest step toward combating a reputation that it had been caught flat-footed in developing AI applications as competitors sped ahead.

Rohit Prasad, Amazon’s head of artificial general intelligence, said in an interview that the company would compete with rivals on price and capabilities, highlighting what he said were faster speeds for the new models. “If I have something better to offer, then customers will come and use it,” he said.

“It is still very early” in the development of AI, and Amazon has an opportunity to take a lead as a result, he said.

Video generation from a single image or text prompt has been particularly hot, with Adobe, Meta, OpenAI, and TikTok parent ByteDance, among others, announcing new AI applications.

According to the company, its Nova Reel software allows users to make six-second videos that can be useful for displaying products on the Amazon website. Videos of up to two minutes will be available in the coming months.

Entertainment industry technologists are eager to use such tools to speed up and enhance filmmaking. Others, however, worry that such systems could infringe on copyrighted works.

Also on Tuesday, Amazon said it had developed Canvas for generating images from short text prompts. Jassy emphasised that Amazon would include watermarking to help ensure the software is used responsibly and to prevent harmful content from being spread.

Other offerings announced Tuesday are meant to reduce the time it takes to process and analyse text. Amazon also plans to introduce, in the near future, an AI model that can take in text, images, speech and video and produce any of those.

In his comments on Tuesday, Jassy said Amazon would release a version of its Alexa speech assistant revamped with AI in the coming months. The project, known internally as Banyan, has been plagued by delays amid concerns over the accuracy and speed of its answers.

REUTERS/Chidimma Gold