RoskerTech

Stability AI Releases Stable Audio 2.0 - All New Features Here

Stability AI has announced the second version of its AI music generation tool. It offers longer tracks, audio-to-audio support, and a stronger commitment to copyright protection for creators.

Stable Audio 2.0 lets users enter natural-language prompts such as "beautiful piano arpeggio growing into a beautiful orchestral piece," "lo-fi funk," or "drum solo" to create a track of up to 3 minutes in 44.1 kHz stereo. AI-generated tracks can include structured compositions with intros, development sections, and outros, as well as stereo sound effects.

Another new feature in Stable Audio 2.0 is the ability to upload audio files to the platform to generate "fully produced samples," which moves the product beyond a purely text-to-audio tool. For example, if you upload a recording of yourself imitating the sound of drums with your voice, the app can use it, together with a text prompt, to create an audio clip of a drum performance.

When using the new audio-to-audio feature, users must refrain from uploading copyrighted material, in accordance with Stability AI's Terms of Service. Stability AI uses content recognition technology to enforce this policy and prevent copyright infringement.

Like Stable Audio 1.0, the second model is trained on AudioSparx's library of 800,000 audio files, comprising music tracks, sound effects, and single-instrument stems, along with text-based metadata. AudioSparx musicians who are unhappy with their work being used to train AI models were given the opportunity to opt out of training.

These copyright-protection and creator opt-out measures come on the heels of the recent departure of former VP of Audio Ed Newton-Rex. He announced his resignation in November 2023 in an X post that was heavily critical of the company's approach to protecting creators' rights.

"I have resigned from my role leading the audio team at Stability AI because I disagree with the company that training generative AI models on copyrighted works is 'fair use,'" he wrote.

He concluded his post by urging others to voice their concerns so that companies "realize that exploiting creators is not a long-term solution in generative AI."

In addition to support for longer tracks and audio-to-audio, Stable Audio 2.0 has an enhanced architecture that facilitates "generation of complete tracks with a coherent structure." By adapting all components of the system, the company says it has achieved "improved performance over long time scales."

The tool features a new, highly compressed autoencoder that turns raw audio waveforms into much shorter representations. It pairs this with a diffusion transformer, similar to the one used in Stable Diffusion 3, which is better suited to manipulating data over long sequences.

"Combining these two elements results in a model that can recognize and reproduce the large-scale structures that are essential for high-quality music," Stability AI wrote in a blog post.
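
To give a rough sense of why compressing the waveform matters, the short Python sketch below counts the raw samples in a 3-minute, 44.1 kHz stereo track and the sequence length left after a fixed downsampling step. The track length, sample rate, and channel count come from the article; the downsampling factor (`ASSUMED_DOWNSAMPLING_FACTOR`) is a hypothetical value chosen only for illustration, since the article does not say how strongly the autoencoder compresses audio.

```python
# Back-of-the-envelope sequence lengths for a full-length track.
SAMPLE_RATE_HZ = 44_100   # from the article: 44.1 kHz
TRACK_SECONDS = 3 * 60    # from the article: tracks of up to 3 minutes
CHANNELS = 2              # from the article: stereo

# Hypothetical value for illustration only; not a published figure.
ASSUMED_DOWNSAMPLING_FACTOR = 1024


def samples_per_channel(seconds: int, sample_rate_hz: int) -> int:
    """Number of raw audio samples in one channel."""
    return seconds * sample_rate_hz


def latent_length(num_samples: int, downsampling_factor: int) -> int:
    """Sequence length after downsampling, rounded up (ceiling division)."""
    return -(-num_samples // downsampling_factor)


raw = samples_per_channel(TRACK_SECONDS, SAMPLE_RATE_HZ)
total_values = raw * CHANNELS
latent = latent_length(raw, ASSUMED_DOWNSAMPLING_FACTOR)

print(f"raw samples per channel: {raw:,}")
print(f"raw values across both channels: {total_values:,}")
print(f"latent steps at an assumed {ASSUMED_DOWNSAMPLING_FACTOR}x reduction: {latent:,}")
```

A 3-minute track works out to 7,938,000 samples per channel, which is far too long for a sequence model to handle directly. Under the assumed factor, the model would work on a sequence of only a few thousand steps.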

The tool is free and ready to use.