AI Safety in 2023: analysts claim issues up 700% in 2023

by Stephen M. Walker II, Co-Founder / CEO

Later this week I'm meeting with UK Officials, providing commentary on AI Safety ahead of the UK's AI Safety Summit in November. While I have some strong opinions about AI Safety, I'll put those aside for now and recap notes from 2023.

To set the foundation: most headlines this year are inaccurate, sometimes accidentally, other times repeated for traffic and ad revenue – looking at you Forbes. Safety issues are not skyrocketing and let's unpack what actually happened this year.

The leading public AI safety database is Incident Database. This database reported 91 incidents in 2022, 96 for 2023, and we'll likely see a total increase of reported events around 15-20% for the year.

Analyzing incidents, here's what I found:

  • At least 6 incidents are claims later proven to be false (eg. Cruise fake news)
  • 7 are related to physical harm or safety with autonomous systems failing
  • 11 are related to data privacy leaks, connected to AI teams, but unrelated to AI technologies
  • 26 are related to impersonation or deep fakes, technologies not connected to GPT-4 size models
  • The remainder are a mix of quality issues, hallucinations, potential bias, or moderation issues – issues, we should expect from an emerging technology

Safety risks are real, especially for ML operating in the physical world. But everyone from Time Magazine to the White House primarily focused on Large Language Models (LLMs) this year.

Time Magazine went as far as publishing a crackpot calling for a ban – with offensive military strikes on offending labs – training GPT-4-scale models. This came as a response to scientists calling for a pause in GPT-4 scale model training.

Keep in mind that at this time the general public did not actually know the actual size or scale of GPT-4 – this narrative was built on imagined, estimated information from 2020.

At Klu, we noted two AI system vulnerabilities this year that enable attackers to bypass alignment or hijack prompts, but I'm not aware of any real-world incidents using these exploits.

To put it plainly: today's AI headlines are typically referring to Generative AI and not classical ML models, with a heavy fear placed on the world-ending power and scale of GPT-4, but with the actual risks coming from the technologies or labs not present at the US Congressional AI hearings.

A definition of AI in 2023

This year the two letters “AI” were applied to everything from autonomous software agents to vision models powering self-driving cars. But in most cases, AI merely referred to Generative AI.

First: Generative AI is not aware or thinking for itself or plotting world domination. GenAI models are merely the probabilistic relationship between inputs and outputs, with Foundational Models spanning language, audio, and visuals. These models are essentially powerful autocompletes, trained on vast datasets to generate common, likely content, and refined further to match the outputs to our expectations. They are exciting because for the first time, average model performance matches or exceeds humans.

A great deep dive on transformer models can be found here.

A definition of safety

AI is coming for us. Or, at least that's the narrative we know from blockbusters like The Matrix or Terminator 2. They're great movies, but not documentaries.

Robustness is the more accurate term for safety in real-world models – that is, how does the model behave under unforeseen circumstances. After all, it is impossible to test for every known scenario in a lab.

Autonomous driving and ML-controlled weapons systems offer the highest potential risk to human safety due to the nature of both driving and military conflict.

AI Safety for LLMs typically focuses on reducing bias and aligning outputs to American societal values. For LLM/FMs or products/services using them, "safety" often fits into these categories:

  • Is my personal data secure
  • Can I trust the content in the news
  • Can I trust the content I generate in apps
  • Are bad actors empowered by this technology

Alignment is a sub-field focused on: Does the content I generate match my expectations? Models are probabilistic, which said differently just means: your results will vary.

Imagine Texas Instruments shipping calculators in 1972 that generated different answers each time. You likely would not rely on this new tool until its answers aligned with your expectations.

To comment on the first three safety categories, Generative AI merely represents another vector in the stack of potential risks for security and trust. A system's complexity is directly proportional to potential flaws or risks. You shouldn't trust GenAI content any more than you trust an "official" narrative or a Wikipedia page.

When it comes to bad actors, GenAI offers them the same productivity increases as the rest of society. Did the printing press, radio, and internet make it easier for bad actors to have negative consequences on society? Likely yes, but we adapted as a people. I spent a few hours one summer night debunking a story on WormGPT, a headline that claimed hackers unleashed a jailbroken ChatGPT. In reality: someone on a hacker chatbot posted an offer for software that claimed it would generate zero-day exploits that you could sell on that board.

Leading GenAI labs take alignment and safety very seriously with Anthropic, Midjourney and OpenAI leading the way in state-of-the-art techniques. But with all of this work there are two known system issues, very similar to classical computing's buffer overflows: adversarial suffixes and prompt injection. Both received little to no attention this year and are the most credible risks for the abuse of AI systems.

2023's safety threats and regulatory capture

After the letter calling for a pause of LLM models, Washington summoned the leading CEOs from LLM labs then went to Washington – with a mostly singular message: please regulate us. Just a few months later those same labs, plus Anthropic and Huggingface participated in an event with the focus on "AI Innovation that Protects Americans' Rights and Safety."

It's hard not to review the facts of 2023 and not think that we were all played. If AI is a great risk, and GPT-4 is taking over the world... then where are makers of the technology behind HQ misinformation via deepfakes, identity theft, or autonomous driving snafus?

From a safety perspective, ChatGPT garnered more headlines for its ability to hallucinate court cases and write (or not write) student homework assignments.

A clear-eyed skeptic might think we're watching the early innings of smart tech CEOs fast track regulatory capture.

What does 2024 hold?

I agree with Sam Altman on one risk: the need for UBI as a stop gap for rethinking how our economy works in a world of abundance.

At Klu, we work with customers building products and workflows that accomplish the tasks of a typical knowledge worker. But instead of taking weeks to complete a task, GenAI models finish a first draft in 10-15 minutes at a cost of around $10 in token usage. Not simple paragraphs or poems, but 60-page strategy assessments pulling in 20+ data sources.

When we live in a world where AI systems are as good as the average worker, what happens to those jobs?

I believe we'll see a major shift in the number of administrators and managers in corporate America, with a larger shift toward a few coaches, many player-captains, and AI-empowered workers. I believe these workers will focus on going from C-player to B-player work outputs, instead of starting from a blank page every time.

AI systems improve at a rate faster than humans learn or reskill. IBM paused hiring 7800 back office roles to replace workload with AI systems. Over the next few years most organizations will slow and stop hiring for administrative, back office roles, and push for higher productivity per worker. It's an opportunity for those wanting to do great work, but it's a layoff notice for the unmotivated.

Thankfully, we elect only the most exceptional and thoughtful members of our society into government, and we can rely on them to handle the centuries old issue of technological unemployment.

More articles

Optimizing LLM Apps

This comprehensive guide provides developers, product managers, and AI Teams with a structured framework for optimizing large language model (LLM) applications to achieve reliable performance. It explores techniques like prompt engineering, retrieval-augmented generation, and fine-tuning to establish strong baselines, fill knowledge gaps, and boost consistency. The goal is to systematically evaluate and improve LLM apps to deliver delightful generative experiences.

Read more

Best Open Source LLMs of 2024

Open source LLMs like Llama 2, Mistral 7B, and Falcon are bringing advanced AI capabilities into the public domain. This guide explores the best open source LLMs and variants for capabilities like chat, reasoning, and coding while outlining options to test models online or run them locally and in production.

Read more

It's time to build

Collaborate with your team on reliable Generative AI features.
Want expert guidance? Book a 1:1 onboarding session from your dashboard.

Start for free