Apple unveils new AI model "ReALM" - potentially faster and smarter than Siri

Apple has introduced a new small language model called ReALM (Reference Resolution As Language Modeling). It runs on phones and is designed to make voice assistants like Siri smarter by allowing them to understand context and ambiguous references.

The announcement comes ahead of the launch of iOS 18 at WWDC 2024 in June, which is expected to be the big push behind the new Siri 2.0, though it is not clear whether this model will be integrated into Siri in time.

This is not Apple's first foray into artificial intelligence in recent months: a mix of new models, tools for running AI more efficiently on smaller devices, and partnerships all paint a picture of a company ready to make AI central to its business.

ReALM is the latest announcement from Apple's rapidly growing AI research team, and the first to focus specifically on improving existing models to make them faster, smarter, and more efficient. The company claims it outperforms OpenAI's GPT-4 on certain tasks.

Details were published in a new Apple open research paper released Friday and first reported by VentureBeat on Monday. Apple has yet to comment on the research or on whether it will actually be part of iOS 18.

Apple now seems to be taking a "throw everything at it and see what sticks" approach to AI. There are rumors of partnerships with Google, Baidu, and even OpenAI, and the company has announced impressive models and tools for running AI locally.

The iPhone maker has been working on AI research for over a decade, but much of it has been hidden inside its apps and services. It wasn't until the release of the latest MacBook models that Apple began using the term AI in its marketing, and that will only increase going forward.

Much of the research focuses on ways to run AI models locally without sending large amounts of data to the cloud for processing. This is essential not only to reduce the cost of running AI applications, but also to meet Apple's strict privacy requirements.

ReALM can be smaller than models like GPT-4 because it does not have to do everything; its purpose is to provide context to other AI systems, such as Siri.

The model reconstructs the screen in text form, labeling each entity and its location. This creates a text-based representation of the visual layout that can be passed to the voice assistant, providing context clues for user requests.
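
The paper does not ship code, but the idea can be illustrated with a minimal sketch: hypothetical on-screen entities with positions are sorted and serialized into plain text that a language model can read as context. The entity labels, coordinates, and the encode_screen helper below are illustrative assumptions, not Apple's implementation.

```python
# Illustrative sketch only (not Apple's code): serialize hypothetical
# on-screen entities into a text layout a language model could use as context.
from dataclasses import dataclass

@dataclass
class ScreenEntity:
    label: str   # e.g. "phone_number", "button"
    text: str    # visible text of the entity
    x: float     # horizontal position, 0.0 (left) to 1.0 (right)
    y: float     # vertical position, 0.0 (top) to 1.0 (bottom)

def encode_screen(entities: list[ScreenEntity]) -> str:
    """Sort entities top-to-bottom, left-to-right and emit one tagged line each."""
    ordered = sorted(entities, key=lambda e: (round(e.y, 1), e.x))
    return "\n".join(
        f'[{i}] {e.label}: "{e.text}" (x={e.x:.2f}, y={e.y:.2f})'
        for i, e in enumerate(ordered)
    )

# The resulting text block, together with a user request such as "call the shop",
# would be handed to the reference-resolution model as context.
screen = [
    ScreenEntity("title", "Joe's Bike Shop", 0.5, 0.1),
    ScreenEntity("phone_number", "555-0123", 0.5, 0.3),
    ScreenEntity("button", "Get Directions", 0.5, 0.8),
]
print(encode_screen(screen))
```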

In terms of accuracy, Apple states that ReALM is comparable to GPT-4 on many key metrics, despite being smaller and faster.

"We would especially like to emphasize the advantages obtained with the on-screen data set, which is a much more accurate and accurate data set than the GPT-4 We find that our model with the text encoding approach performs nearly as well as GPT-4, even though the latter is provided by screenshots," the authors write

What this means is that when a future version of ReALM, or even this one, is integrated into Siri, the assistant should better understand what the user means by requests like "open this app" or "tell me what this word in the image means."

It also gives Siri more conversational capability without having to deploy a full large language model like Gemini.

Coupled with other recent Apple research on "one-shot" responses, where the AI can produce an answer from a single prompt, it is a sign that Apple is still investing heavily in the AI assistant space in addition to relying on outside models.
