The long-awaited Google I/O 2024 developer conference kicked off Tuesday with the expected slew of announcements covering AI, applications and future tools.
The phrase "AI" was mentioned more than a hundred times on the day of the keynote alone. But alongside all the Gemini hype, there is some really interesting video and image generation news coming out of Google Labs that points to the future.
Note that the emphasis is on the future. Sadly, the best of these "experiments" are currently available only on a wait-and-see basis. Here's how to access the tools that are already live and how to join the waiting list for the models that have not yet launched.
At the top of the list is VideoFX, Google's new text-to-video rival to OpenAI's Sora. The new tool is based on Google DeepMind's Veo model, which allows users to generate 1080p video clips "over 1 minute" in length.
The video clips on the Veo demo page are impressive and promise some cool upcoming features such as clip extension, video from still images, and mask editing.
Unfortunately, this new wonder is not yet available; you need to sign up for the waiting list to gain access.
Google's text-to-image model Imagen 2 was launched 3 months ago, and the technology is already available on the ImageFX website.
However, adventurous souls who want to test the brand-new upgrade, Imagen 3, must once again join, yes, a waiting list for trusted testers.
The current image results are good, but certainly nothing to shout about from a company seen as an AI pioneer. We expect the new version to offer significant improvements.
MusicFX was launched on May 12 last year and was a solid attempt at an AI text-to-music generator at the time.
However, 5 months equals 2 AI lifetimes, and the tool has now been hopelessly overtaken by newcomers like Udio and Suno. And things will probably only get worse with the upcoming release of ElevenLabs Music.
Nevertheless, Google is bravely fighting to stay relevant, with an upgrade to MusicFX at this week's I/O showcase. The new DJ mode lets you mix various genres using text prompts, with sliders to adjust the intensity of each.
There are no vocals yet, and although the results are good, they are still sub-par next to the rich complexity of services like Udio. The good news is that both DJ and creator modes are now available on the AI Test Kitchen site.
Overall, it has been a hard 12 months for Google, which increasingly looks like a flat-footed giant caught napping at the AI kitchen table.
The company is gradually rolling out new applications, but many of them seem like desperate attempts to catch up with more agile and creative rivals.
During a keynote segment on the power of Gemini 1.5 to bring multimodal capabilities to NotebookLM, a small notice at the bottom right of the presentation screen declared "Audio is pre-generated." Not a great look for the company behind DeepMind.