RoskerTech

StabilityAI Announces Stable Diffusion 3 - Does for AI Images What Sora Does for Video


StabilityAI has announced Stable Diffusion 3, the next generation of its popular open-source AI image generation model, and it is an impressive leap forward.

Details of the new model were revealed along with a series of images and prompts showing that it can follow complex instructions and create ultra-realistic images.

An early preview of the model will be available only to a select group of testers while StabilityAI gathers feedback to improve performance and safety prior to public release.

StabilityAI also used Spawning's "Do Not Train" registry to ensure that images from artists who did not want their work used for AI training were excluded. Prior to training, over 15 billion images were filtered from the dataset.

Unlike DALL-E, MidJourney, and Google's Imagen, Stable Diffusion is an open model that can be integrated into other platforms or run locally if sufficient computing power is available.

SD3 will ship as a family of models ranging from 800 million to 8 billion parameters, so it can be run at different quality levels and on a wide range of hardware.
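To give a rough sense of what that parameter range means for hardware, the sketch below estimates the memory needed just to hold the model weights. It assumes 2 bytes per parameter (16-bit weights) or 4 bytes (32-bit), and it ignores activations, text encoders, and any other components, none of which StabilityAI has detailed. The function name `weight_memory_gb` is made up for this illustration.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes).

    Assumption: every parameter is stored at the same precision.
    """
    return num_params * bytes_per_param / 1e9


# The two ends of the announced range: 800 million and 8 billion parameters.
sizes = [("smallest", 800_000_000), ("largest", 8_000_000_000)]

for label, num_params in sizes:
    half = weight_memory_gb(num_params, bytes_per_param=2)  # 16-bit weights
    full = weight_memory_gb(num_params, bytes_per_param=4)  # 32-bit weights
    print(f"{label}: about {half:.1f} GB at 16-bit, {full:.1f} GB at 32-bit")
```

By this estimate the smallest model's weights would fit comfortably on a modest consumer GPU, while the largest would need a high-end card or some form of compression. Real memory use will be higher than the weights alone.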

Like OpenAI's Sora, Stable Diffusion 3 combines diffusion model technology with a transformer architecture, which may account for its improved instruction-following capabilities.

It also uses flow matching, a mathematical technique for training diffusion-style models that compares real-world and generated images at various stages of the generation process.
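StabilityAI has not published how SD3 applies flow matching, so the following is only a minimal sketch of one common form of the idea, not the company's method. It assumes a straight-line path from a random noise sample to a real data sample: at a random point along that path, a model is asked to predict the direction of travel, and the loss is the squared error against the true direction. The names `flow_matching_loss` and `predict_velocity` are hypothetical, and small NumPy vectors stand in for images.

```python
import numpy as np


def flow_matching_loss(predict_velocity, x1, rng):
    """Squared-error flow matching loss on a batch of data samples.

    x1 has shape (batch, dim). Assumption: a straight-line path
    x_t = (1 - t) * x0 + t * x1 from noise x0 to data x1.
    """
    x0 = rng.standard_normal(x1.shape)       # random noise, same shape as data
    t = rng.uniform(size=(x1.shape[0], 1))   # one random time in [0, 1) per sample
    xt = (1.0 - t) * x0 + t * x1             # point part-way between noise and data
    target = x1 - x0                         # constant velocity of the straight path
    pred = predict_velocity(xt, t)           # the model's guess at that velocity
    return float(np.mean((pred - target) ** 2))


def zero_model(xt, t):
    """Toy stand-in for a neural network: always predicts no movement."""
    return np.zeros_like(xt)


rng = np.random.default_rng(0)
data = rng.standard_normal((64, 8)) + 3.0    # toy "images": 64 vectors of length 8
print("loss of an untrained toy model:", flow_matching_loss(zero_model, data, rng))
```

In real training, `predict_velocity` would be a neural network (in SD3's case reportedly a transformer) whose weights are adjusted to push this loss down. At generation time the learned velocities are followed step by step from pure noise to an image.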

Few people outside the development team have had direct access to Stable Diffusion 3 so far, and no research paper has been published yet.

From what we have seen so far, this is an important step change in image generation. Along with OpenAI's Sora, it represents a major upgrade in how generative AI works and what it can do.

It renders consistent, readable text within images, fixes long-standing problems with human anatomy such as fingers, and appears to capture color well.

Emad Mostaque, founder of StabilityAI, said that the company has 100 times fewer resources for training AI models than a company like OpenAI, but still accomplishes impressive work. He suggested that, like Sora, SD3 can accept a variety of inputs, including video and images.

Details of SD3 were released a few days after StabilityAI announced Stable Cascade.