* Virtual assistants can perform tasks with greater autonomy
* Can take action on your behalf, with supervision, CEO says
* Piggybacks on already popular applications
* Google trying to reclaim lead in AI tech field
By Kenrick Cai
SAN FRANCISCO, Dec 11 - Alphabet's Google on
Wednesday released the second generation of its artificial
intelligence model Gemini and teased a slate of new ways to use
AI beyond chatbots, including through a pair of eyeglasses.
CEO Sundar Pichai in a blog post dubbed the moment the
start of a "new agentic era," referring to virtual assistants
that can perform tasks with greater autonomy.
"They can understand more about the world around you, think
multiple steps ahead, and take action on your behalf, with your
supervision."
The releases underscore the methods by which Google is
aiming to reclaim the lead in the race to dominate the emerging
technology. Microsoft-backed OpenAI captured global
attention when it released chatbot ChatGPT in November 2022.
Google unveiled Gemini in December 2023 and now offers
four versions.
On Wednesday, it released an update to Flash, its
second-cheapest model, with improved performance and added features to
process images and audio. Other models will come next year.
OpenAI has in recent days announced a flurry of new
offerings to diversify its prospects including a $200-a-month
ChatGPT subscription for advanced research use and the
availability of its text-to-video model Sora.
Google's play involves injecting its AI advances into
applications that already enjoy widespread adoption. Search,
Android and YouTube are among seven products that the company
says are used by more than 2 billion people monthly.
That user base is a significant advantage over challenger
startups such as search startup Perplexity, which is seeking a
$9 billion valuation, and newer research labs like OpenAI,
Anthropic or Elon Musk's xAI.
The Gemini 2.0 Flash model will power applications including
AI Overviews in its search engine.
Alphabet's biggest bet is AI for search, Ruth Porat, the
president and chief investment officer, said at the Reuters NEXT
conference in New York on Tuesday.
Google also showed reporters new capabilities for Project
Astra, a prototype universal agent which can talk to users about
anything captured on their smartphone camera in real time.
The tool can now hold a conversation spoken in a mix of
languages, as well as process information from Maps and image
recognition tool Lens, DeepMind group product manager Bibo Xu
told reporters.
Astra will also be tested on prototype eyeglasses, the
company's first return to the product area since the failure of
Google Glass. Others have since entered the market, including
Meta, which in September unveiled an AR glasses prototype.
Google also showed reporters Project Mariner, a Chrome
web browser extension that can automate keystrokes and mouse
clicks in the vein of rival lab Anthropic's "computer use"
feature; Jules, a feature to improve software coding; and
a tool to assist consumers in making decisions such as what to do
or which items to buy in video games.