AI is making software bugs weirder and harder to fix
Published Date: 2/28/2024
Source: axios.com

Generative AI is raising the curtain on a new era of software breakdowns rooted in the same creative capabilities that make it powerful.

Why it matters: Every novel technology brings bugs, but AI's will be especially thorny and frustrating because they're so different from the ones we're used to.


Driving the news: AT&T's cellular network and Google's Gemini chatbot both went on the fritz last week.

  • In AT&T's breakdown, a "software configuration error" left thousands of customers without wireless service during their morning commute.
  • It was an annoying but familiar case of the kind of network outage that both the public and the tech industry learned to deal with long ago.

Google's bug was very different. Its Gemini image generator created a variety of ahistorical images: When asked to depict Nazi soldiers, it included illustrations of Black people in uniform; when asked to draw a pope, it produced an image of a woman in papal robes.

  • This was a more complex sort of error than AT&T's, at the boundary between engineering and politics, where it looked like a diversity policy had gone haywire. Google paused all AI generation of images of people until it could fix the problem.
  • Google CEO Sundar Pichai sent a memo to staff Tuesday saying of Gemini, "I know that some of its responses have offended our users and shown bias – to be clear, that's completely unacceptable and we got it wrong."

Conservatives shouted, "Woke!" But Google was taking precautions against bias much like those every AI company has learned to apply after years of embarrassing missteps — particularly in the realm of human images, where AI-generated terrorists all seemed to have dark skin and every AI-generated doctor was a white male.

  • Those problems have dogged the field because the data used to train generative AI models, typically culled from mountains of web pages, is full of the biases of the humans who created it.
  • AI firms could have reduced these problems by spending a lot of time and money up front to curate their models' training data. Instead, they are now applying spotty Band-Aid patches to the problems after the fact.

The big picture: In both the AT&T and the Google incidents, systems failed because of what people asked computers to do.

  • AT&T's wireless service crashed when it tried to follow new instructions that contained an error or contradiction, causing the system to stop responding.
  • That's how most computer crashes have happened since computers were invented.

Yes, but: Most AI systems don't operate by explicit commands and instructions — they rely on "weights," numerical parameters learned from training data, that shape their output probabilistically.

  • Developers can put their fingers on the scales — and clearly they did with Gemini. They just didn't get the results they sought.
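For illustration only, here is a toy Python sketch (not Google's or any vendor's actual code) of what "weights shaping output" means in practice: the next word is drawn at random from a probability distribution, and a developer's nudge changes the odds without dictating the result.

```python
import random

# Toy "model": instead of executing a fixed instruction, it draws the next
# word at random from weighted probabilities. (Illustrative only.)
next_word_probs = {"astronaut": 0.45, "doctor": 0.30, "pope": 0.25}

# A developer "puts a finger on the scale" by boosting one option.
# This is a hypothetical nudge, not how Gemini was actually tuned.
bias = {"pope": 1.5}
adjusted = {w: p * bias.get(w, 1.0) for w, p in next_word_probs.items()}
total = sum(adjusted.values())
adjusted = {w: p / total for w, p in adjusted.items()}  # renormalize to sum to 1

# The nudge shifts the odds, but each run can still produce a different word.
word = random.choices(list(adjusted), weights=list(adjusted.values()))[0]
print(word)
```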

Zoom in: Gemini isn't able to "think" clearly about the difference between historical research and "what if?" creativity.

  • A smarter system — or any reasonably well-informed human — would understand that the Roman Catholic Church has never had a pope who wasn't male. (There are lots of tales — all believed to be legend rather than fact — of a medieval "Pope Joan," and they likely pop up in the AI's training data, as they do in Wikipedia.)

Zoom out: Making things up is at the heart of what generative AI does.

  • Traditional software executes the next instruction in its code; generative AI programs "guess" the next word or pixel in a sequence based on patterns learned from training data and the guidelines people give them.
  • You can tune a model's "temperature" up or down, making its output more or less random. But you can't just stop it from being creative, or it won't do anything at all.
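As a rough sketch of what "temperature" does, consider this toy next-word sampler (an assumption-laden illustration, not any vendor's implementation): dividing the model's scores by a low temperature sharpens the distribution toward the most likely guess, while a high temperature flattens it toward randomness. Either way, the program is still guessing, not looking facts up.

```python
import math
import random

def sample_with_temperature(scores: dict[str, float], temperature: float) -> str:
    """Toy next-word sampler: softmax over scores, scaled by temperature."""
    scaled = {w: s / temperature for w, s in scores.items()}
    top = max(scaled.values())                        # subtract max for numerical stability
    exps = {w: math.exp(s - top) for w, s in scaled.items()}
    total = sum(exps.values())
    probs = {w: e / total for w, e in exps.items()}
    return random.choices(list(probs), weights=list(probs.values()))[0]

# Hypothetical scores a model might assign to the next word after "The pope is ..."
scores = {"male": 2.0, "female": 0.5, "Italian": 1.0}

print(sample_with_temperature(scores, temperature=0.2))  # low: almost always "male"
print(sample_with_temperature(scores, temperature=2.0))  # high: much more varied output
```

Even with the temperature turned way down, the toy sampler is only ever as "factual" as the scores it starts with — which is the crux of the problem the article describes.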

Between the lines: The choices AI models make are often opaque, and even their creators don't fully understand how they work.

  • So when developers try to add "guardrails" to, say, diversify image results or limit political propaganda and hate speech, their interventions have unpredictable results and can backfire.
  • But not intervening will lead to biased and troublesome results, too.

What's next: The more we try to use generative AI systems to accurately represent reality and serve as knowledge tools, the more urgently the industry will try to place boundaries around their "hallucinations" and errors.

  • Generative AI is, as one expert put it soon after ChatGPT's debut, the greatest BS machine ever.
  • Over time, as the field learns from experience, AI programmers could succeed at taming and tuning their models to be more factual and less biased.
  • But it could turn out that using generative AI as the fundamental interface for knowledge work is just a very bad idea that will continue to bedevil us.

Editor's note: This story has been updated with Sundar Pichai's comments.