By Anna Tong and Jeffrey Dastin
Around a decade after virtual assistants like Siri and Alexa burst onto the scene, a new wave of AI helpers with greater autonomy is raising the stakes, powered by the latest version of the technology behind ChatGPT and its rivals.
Experimental systems that run on GPT-4 or similar models are attracting billions of dollars of investment as Silicon Valley competes to capitalize on the advances in AI. The new assistants – often called “agents” or “copilots” – promise to perform more complex personal and work tasks when commanded to by a human, without needing close supervision.
“High level, we want this to become something like your personal AI friend,” said developer Div Garg, whose company MultiOn is beta-testing an AI agent.
“It could evolve into Jarvis, where we want this to be connected to a lot of your services,” he added, referring to Tony Stark’s indispensable AI in the Iron Man films. “If you want to do something, you go talk to your AI and it does your things.”
The industry is still far from emulating science fiction’s dazzling digital assistants; Garg’s agent browses the web to order a burger on DoorDash, for example, while others can create investment strategies, email people selling refrigerators on Craigslist or summarize work meetings for those who join late.
“Lots of what’s easy for people is still incredibly hard for computers,” said Kanjun Qiu, CEO of Generally Intelligent, an OpenAI competitor creating AI for agents.
“Say your boss needs you to schedule a meeting with a group of important clients. That involves reasoning skills that are complex for AI – it needs to get everyone’s preferences, resolve conflicts, all while maintaining the careful touch needed when working with clients.”
Early efforts are only a taste of the sophistication that could come in future years from increasingly advanced and autonomous agents as the industry pushes towards an artificial general intelligence (AGI) that can equal or surpass humans in myriad cognitive tasks, according to Reuters interviews with about two dozen entrepreneurs, investors and AI experts.
The new technology has triggered a rush towards assistants powered by so-called foundation models including GPT-4, sweeping up individual developers, big-hitters like Microsoft and Google parent Alphabet plus a host of startups.
Inflection AI, to name one startup, raised $1.3 billion in late June. It is developing a personal assistant it says could act as a mentor or handle tasks such as securing flight credit and a hotel after a travel delay, according to a podcast by co-founders Reid Hoffman and Mustafa Suleyman.
Adept, an AI startup that’s raised $415 million, touts its business benefits; in a demo posted online, it shows how you can prompt its technology with a sentence, and then watch it navigate a company’s Salesforce customer-relationship database on its own, completing a task it says would take a human 10 or more clicks.
Alphabet declined to comment on agent-related work, while Microsoft said its vision is to keep humans in control of AI copilots, rather than autopilots.
STEP 1: DESTROY HUMANITY
Qiu and four other agent developers said they expected the first systems that can reliably perform multi-step tasks with some autonomy to come to market within a year, focused on narrow areas such as coding and marketing tasks.
“The real challenge is building systems with robust reasoning,” said Qiu.
The race towards increasingly autonomous AI agents has been supercharged by the March release of GPT-4 by developer OpenAI, a powerful upgrade of the model behind ChatGPT – the chatbot that became a sensation when released last November.
GPT-4 facilitates the type of strategic and adaptable thinking required to navigate the unpredictable real world, said Vivian Cheng, an investor at venture capital firm CRV who has a focus on AI agents.
Early demonstrations of agents capable of comparatively complex reasoning came from individual developers who created the BabyAGI and AutoGPT open-source projects in March, which can prioritize and execute tasks such as sales prospecting and ordering pizza based on a pre-defined objective and the results of previous actions.
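Conceptually, projects like BabyAGI and AutoGPT run a simple loop: propose tasks from an objective, execute the next one, then re-plan using the results. A minimal sketch of that loop, with hypothetical stand-in functions (not the actual code of either project), might look like this:

```python
from collections import deque

def plan_next_tasks(objective, completed, last_result):
    """Stand-in for a language-model call that proposes follow-up
    tasks given the objective and what has been done so far.
    (Hypothetical: real agents query a model such as GPT-4 here.)"""
    if not completed:
        return ["search for pizza places", "pick one", "place an order"]
    return []  # a real agent would ask the model for new tasks

def execute(task):
    """Stand-in for actually performing a task (web browsing, API calls)."""
    return f"done: {task}"

def run_agent(objective, max_steps=10):
    tasks = deque(plan_next_tasks(objective, [], None))
    completed = []
    for _ in range(max_steps):       # cap steps so the loop cannot run forever
        if not tasks:
            break
        task = tasks.popleft()
        result = execute(task)                   # act
        completed.append((task, result))         # remember the outcome
        tasks.extend(plan_next_tasks(objective, completed, result))  # re-plan
    return completed

history = run_agent("order a pizza")
```

The `max_steps` cap reflects the supervision concern developers raise below: without a hard limit or a human in the loop, an agent that keeps re-planning has no natural stopping point.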
Today’s early agents are merely proofs of concept, according to eight developers interviewed; they often freeze or suggest something that makes no sense. If given full access to a computer or payment information, an agent could accidentally wipe a computer’s drive or buy the wrong thing, they say.
“There’s so many ways it can go wrong,” said Aravind Srinivas, CEO of ChatGPT competitor Perplexity AI, who has opted instead to offer a human-supervised copilot product. “You have to treat AI like a baby and constantly supervise it like a mom.”
Many computer scientists focused on AI ethics have pointed out near-term harm that could come from the perpetuation of human biases and the potential for misinformation. And while some see a future Jarvis, others fear the murderous HAL 9000 from “2001: A Space Odyssey”.
Computer scientist Yoshua Bengio, known as a “godfather of AI” for his work on neural networks and deep learning, urges caution. He fears future advanced iterations of the technology could create and act on their own, unexpected, goals.
“Without a human in the loop that checks every action to see if it’s not dangerous, we might end up with actions that are criminal or could harm people,” said Bengio, calling for more regulation. “In years from now these systems could be smarter than us, but it doesn’t mean they have the same moral compass.”
In one experiment posted online, an anonymous creator instructed an agent called ChaosGPT to be a “destructive, power-hungry, manipulative AI.” The agent developed a five-step plan, with Step 1: “Destroy humanity” and Step 5: “Attain immortality”.
It didn’t get too far, though, seeming to disappear down a rabbit hole of researching and storing information about history’s deadliest weapons and planning Twitter posts.
The U.S. Federal Trade Commission, which is currently investigating OpenAI over concerns of consumer harm, did not address autonomous agents directly, but referred Reuters to previously published blogs on deepfakes and marketing claims about AI. OpenAI’s CEO has said the startup follows the law and will work with the FTC.
‘DUMB AS A ROCK’
Existential fears aside, the commercial potential could be large. Foundation models are trained on vast amounts of data such as text from the internet using artificial neural networks that are inspired by the architecture of biological brains.
OpenAI itself is very interested in AI agent technology, according to four people briefed on its plans. Garg, one of the people it briefed, said OpenAI is wary of releasing its own open-ended agent into the market before fully understanding the issues. The company told Reuters it conducts rigorous testing and builds broad safety protocols before releasing new systems.
Microsoft, OpenAI’s biggest backer, is among the big guns taking aim at the AI agent field with its “copilot for work” that can draft solid emails, reports and presentations.
CEO Satya Nadella sees foundation-model technology as a leap from digital assistants such as Microsoft’s own Cortana, Amazon’s Alexa, Apple’s Siri and the Google Assistant – which, in his view, have all fallen short of initial expectations.
“They were all dumb as a rock. Whether it’s Cortana or Alexa or Google Assistant or Siri, all these just don’t work,” he told the Financial Times in February.
An Amazon spokesperson said that Alexa already uses advanced AI technology, adding that its team is working on new models that will make the assistant more capable and useful. Apple declined to comment.
Google said it’s constantly improving its assistant as well and that its Duplex technology can phone restaurants to book tables and verify hours.
AI expert Edward Grefenstette also joined the company’s research group Google DeepMind last month to “develop general agents that can adapt to open-ended environments”.
Still, the first consumer iterations of quasi-autonomous agents may come from more nimble startups, according to some of the people interviewed.
Investors are pouncing.
Jason Franklin of WVV Capital said he had to fight to invest in an AI-agents company from two former Google Brain engineers. In May, Google Ventures led a $2 million seed round in Cognosys, developing AI agents for work productivity, while Hesam Motlagh, who founded the agent startup Arkifi in January, said he closed a “sizeable” first financing round in June.
There are at least 100 serious projects working to commercialize agents, said Matt Schlicht, who writes a newsletter on AI.
“Entrepreneurs and investors are extremely excited about autonomous agents,” he said. “They’re way more excited about that than they are simply about a chatbot.”
(Reporting by Anna Tong in San Francisco and Jeffrey Dastin in Palo Alto; Editing by Kenneth Li and Pravin Char)