Google I/O is almost here, and now that Google has already wrapped up The Android Show, all eyes are shifting toward the company’s AI ambitions — especially Gemini. While nothing has been officially announced yet, a new leak gives us an early glimpse of what Google could be preparing behind the scenes. The leak, shared by a user on X with the handle Waguri_Kaoruko8, showcases something called the “Gemini Spark Model,” alongside a new Agent or Chat Mode designed for more advanced tool-based actions. And honestly, this feels like Google trying to turn Gemini into a proper AI assistant that handles annoying digital chores for you.
The leak was later quote-posted by the AI news account TestingCatalog, which said there currently doesn’t seem to be support for importing SKILL.md files directly. Apparently, users may have to rely on the good old copy-paste method for now. The post also mentions there’s no sign of browser control or full computer-use capabilities yet, two features many people were hoping would arrive as AI agents become more capable. But the screenshots themselves are where things get interesting.
Your inbox might soon fear Gemini more than spam
According to the leaked interface, the Gemini Spark Model, currently labeled as beta, appears focused on automation and personalization. One feature can reportedly clean up your inbox by summarizing newsletters, archiving clutter, and even automatically unsubscribing from mailing lists. Another tool can generate meeting briefs, pulling together relevant information and quick summaries before an important call or appointment. There’s also a custom news digest feature that seems designed to follow the stories you actually care about, rather than flooding you with random headlines all day. In a way, it feels like Google is pushing Gemini toward becoming a background productivity layer rather than just an AI you occasionally ask questions. And frankly, that’s probably the smarter direction.
This shift reflects broader trends in the AI industry. Major players like OpenAI and Anthropic are also moving toward agentic systems that can execute tasks autonomously. For instance, OpenAI’s ChatGPT has introduced plugins and code interpreter capabilities that allow it to perform actions beyond simple conversation. Similarly, Anthropic’s Claude supports tool use. Google’s move with the Gemini Spark Model looks like an attempt to catch up and differentiate by deeply integrating with the Google ecosystem, such as Gmail and Calendar, which already have billions of users. Seamless workflows, like having Gemini automatically unsubscribe from spam newsletters and then generate a daily digest of relevant tech news, could give users a reason to stay within Google’s suite of apps.
Google may be building a DIY AI workflow system
The leak also suggests users may be able to create custom “skills” for Gemini. The setup process reportedly involves giving the skill a title, explaining what it does, and adding instructions for how Gemini should behave. Think of it almost like building mini AI workflows without coding. This resembles the approach of platforms like Zapier or IFTTT, but with the AI handling the logic instead of rigid triggers and actions. For example, if the necessary integrations exist, a user could create a skill that monitors a specific RSS feed, summarizes new articles in a note, and sends a digest to a designated Slack channel. Nothing in the leak confirms that particular workflow, but the no-code approach would make this kind of automation accessible to non-technical users.
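To make the three-field setup concrete, here is a minimal Python sketch of what a skill definition might look like as plain data. Everything in it is an assumption for illustration: the `Skill` class, its `validate` and `as_text` methods, and the output layout are hypothetical and do not reflect any confirmed Google format or API. Only the three fields (title, description, instructions) come from the leak.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical container for the three fields the leak describes."""

    title: str
    description: str
    instructions: str

    def validate(self) -> None:
        # Reject blank fields so a half-filled skill is caught early.
        for name in ("title", "description", "instructions"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")

    def as_text(self) -> str:
        # Render the skill as plain text suitable for copy-pasting,
        # since the leak suggests file import is not supported yet.
        # This layout is an invented convention, not a real format.
        self.validate()
        return (
            f"Title: {self.title.strip()}\n"
            f"Description: {self.description.strip()}\n"
            f"Instructions:\n{self.instructions.strip()}\n"
        )


digest = Skill(
    title="Morning tech digest",
    description="Summarizes the tech stories I follow.",
    instructions="Each morning, list the top five stories with one-line summaries.",
)
print(digest.as_text())
```

The point of the sketch is simply that a skill, as described, is closer to a structured prompt than to a program: there are no triggers or actions to wire up, only text that tells the assistant what to do.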
However, the absence of full computer-use capabilities is notable. Rivals have shown that AI agents can control a browser or desktop environment to complete tasks, with Anthropic’s experimental computer use feature for Claude being one example. Google may be holding back on this for safety and reliability reasons, since allowing an AI to directly manipulate a user’s entire system brings risks of errors and misuse. Starting with constrained tasks like email and news digests would let Google gather feedback and refine the underlying architecture before expanding into more sensitive areas.
The leak lands at a notable moment. This year’s Google I/O is expected to be heavily focused on AI, with Gemini as the star. Past I/O events have seen major announcements like the launch of Google Assistant and, later, the wider rollout of Bard (now Gemini). This year, the company is under pressure to demonstrate that it can lead the AI race, especially given the rapid adoption of ChatGPT. The Gemini Spark Model could be Google’s answer: not just a chatbot, but an integrated assistant that lives on your phone and proactively manages your digital life.
Of course, it’s important to keep expectations in check here. None of this is official yet, and leaks around Google I/O season tend to fly around fast. Still, with Google expected to go all-in on Gemini at I/O next week, there’s a very real chance we could see at least some of these features become official sooner rather than later. The company has also been working on hardware improvements, including faster chips in Pixel phones that can run AI models locally for reduced latency and improved privacy. Combining on-device processing with cloud-based agentic skills could give Gemini an edge in responsiveness and data control.
Another aspect to consider is the potential impact on competitors. Google’s vast user base across Gmail, Calendar, Drive, and other services gives it a huge advantage if Gemini can seamlessly tap into those data sources. For example, a meeting brief feature could automatically pull relevant emails, documents, and calendar events without needing manual setup. This deep integration is something standalone apps like ChatGPT cannot easily replicate. However, it also raises privacy concerns — users may be wary of granting an AI such broad access to their personal data. Google will need to be transparent about how data is used and offer granular control over permissions.
In the broader context, the race to build the ultimate AI assistant is heating up. Apple is also reportedly working on a major Siri upgrade with large language models, and Amazon is revamping Alexa with generative AI. But Google’s approach with the Gemini Spark Model seems to emphasize practicality over flashiness. Instead of trying to replace human workers, it focuses on reducing digital clutter and easing common pain points. This pragmatic strategy may resonate with users who are overwhelmed by information overload.
From a technical standpoint, building an agentic model that can reliably perform tasks across different apps and services is incredibly challenging. It requires not just language understanding, but also planning, memory, and error recovery. The “beta” label on the Gemini Spark Model suggests that Google is aware of the limitations and would be releasing it in an experimental state. Early adopters will likely encounter bugs and unexpected behaviors, but their feedback will be crucial in shaping the final product. Meanwhile, Google’s AI research division, Google DeepMind, has been making strides in reinforcement learning and multimodal models, which could eventually power more advanced agentic capabilities.
One particularly intriguing aspect of the leak is the mention of “SKILL.md files.” While not yet supported for import, this suggests a future where users can download or share pre-built skills, similar to how plugins or extensions are distributed. This could create a whole ecosystem of third-party AI automations. Imagine a marketplace where developers publish skills for specific niches: social media monitoring, financial reporting, health tracking, and so on. Google could leverage its existing developer community from Android and Chrome to jumpstart such an ecosystem. However, this would also require robust vetting processes to prevent malicious or poorly designed skills from causing harm.
The custom news digest feature also hints at a shift in how we consume information. Instead of relying on algorithmic feeds from social media or news aggregators, an AI assistant could curate a personalized briefing based on explicit interests and implicit reading habits. This could reduce echo chambers if done transparently, or exacerbate them if the AI only shows content that aligns with user biases. Google will need to balance personalization with diversity of viewpoints, a challenge that traditional search algorithms have faced for years.
As the I/O keynote approaches, anticipation is building. The leak has generated significant discussion in online tech communities, with many expressing excitement but also skepticism. Some question whether Google can execute effectively, given past stumbles with products like Google+ and Stadia. Others worry about the implications of ceding even more digital chores to AI. But one thing is clear: the era of static chatbots is ending, and agentic assistants are the next frontier. Google’s Gemini Spark Model, if real, represents a significant step toward that future. Whether it will live up to the hype remains to be seen, but the foundation is being laid for a more autonomous, helpful, and integrated AI experience on our phones.
Source: Digital Trends News