The brave new world of large language models (LLMs) has finally dealt us a space angle. This week, accompanying a Paris event to preview its new Bard chatbot, Google released an ad about the ChatGPT competitor. The issue? Bard confidently shared an incorrect answer about JWST.
The incorrect answer, first shared Monday but widely circulated Wednesday, sent shares of Google parent Alphabet ($GOOG) tumbling.
The fault, dear Google… As part of a demo video for Bard, Google posed the question, “What new discoveries from the James Webb Space Telescope can I tell my 9 year old about?” In response, the AI announced that JWST “took the very first pictures of a planet outside our solar system.”
The problem, really, lies not in the facts, but in the phrasing. Last year, JWST took its own first direct images of an exoplanet, HIP 65426 b. The first image of any exoplanet was taken in 2004 by the European Southern Observatory’s Very Large Telescope.
All that glitters is not gold: ChatGPT, Bard, and other LLM-based tools are impressive and respond to users’ queries very confidently, but they’re frequently flat-out wrong.
“It perfectly shows the most important weakness of statistical systems,” Carissa Véliz at the University of Oxford told New Scientist. “These systems are designed to give plausible answers, depending on statistical analysis—they’re not designed to give out truthful answers.”
Still, chatbot interfaces are catching on. On Tuesday, Microsoft announced that it is integrating a souped-up version of ChatGPT with Bing search and the Edge browser to return more tailored answers to queries. Google has similar plans with Bard. Not to be outdone, Chinese search giant Baidu has its own LLM plans in the works.
What’s past is prologue: Bard’s JWST flub wiped as much as $100B from Alphabet’s market cap yesterday. Now, Google is looking to make sure it doesn’t repeat its mistake.
“This highlights the importance of a rigorous testing process, something that we’re kicking off this week with our Trusted Tester program,” a Google spokesperson told New Scientist. “We’ll combine external feedback with our own internal testing to make sure Bard’s responses meet a high bar for quality, safety and groundedness in real-world information.”
The best advice comes from the Bard himself: “Go wisely and slowly. Those who rush stumble and fall.”
+ While we’re here: ChatGPT is struggling with rocket science…