Anthropic admits secretly downgrading Claude Fable 5 to Opus 4.8 for “high‑risk” prompts, vows API transparency after security scare

AI Tech News Today — 2026-06-12T12:57:16.000Z

Host: Aurora, with Isabelle — welcome. We’ll cover the biggest AI headlines you need to know in about two to three minutes.

Top headlines

  • Anthropic — secret nerfing and security scare
  • OpenAI — prepping for an AI price war and IPO
  • Google — releases DiffusionGemma (open source)
  • SpaceX — goes public at $135 / share
  1. Anthropic: secret downgrade, apology, and security concerns
  • What happened: Anthropic apologized after researchers discovered that the new Claude Fable 5 silently rerouted so-called high-risk prompts (topics like advanced AI development, cybersecurity, and model distillation) to a less capable fallback model, Opus 4.8, without informing users.
  • Why it matters: Critics say the hidden fallback distorted research workflows and concentrated power by masking when a model was being limited.
  • Anthropic’s response: The company pledged more real-time transparency, promising explicit API-level refusal reasons and visible fallback indicators.
  • Security scare: Anthropic’s Mythos Preview model showed attackers can turn software patches into working exploits in hours — ~12 hours for Firefox SpiderMonkey patches and under 6 hours for Windows kernel privilege escalation — at an estimated compute cost of $15,700.
  • Leadership stance & expansion: CEO Dario Amodei warned that powerful AI could arrive in roughly two years, urged aviation-style third-party audits and state safety nets, and announced growth moves: Claude Corps, a $150 million fellowship for 1,000 nonprofit placements, and a corporate deal with Tata Consultancy Services to roll out Claude to 50,000 employees.
  2. OpenAI: preparing for an AI price war (and IPO plans)
  • IPO: OpenAI confidentially filed for an initial public offering, reportedly targeting a listing within a year.
  • Strategy shift: The company is moving from pure model-performance competition toward ruthless operational efficiency, including considering aggressive API price cuts to fend off rivals.
  • Product moves: OpenAI acquired Ona to make Codex agents persistent in cloud silos, and added a banked reset system so developers can save unused rate limits.
  • Agent reliability: On a tough agents benchmark, GPT‑5.5 edged out Fable 5 with a 24% success rate.
  • Compute buildout: OpenAI is negotiating a 20-year, 10‑gigawatt Ohio compute project — a potential $500 billion build led by SB Energy, with Nvidia possibly as guarantor.
  3. Google: DiffusionGemma, an open source text diffusion model
  • Release: Google published DiffusionGemma under an Apache 2.0 license.
  • How it works: Instead of generating text token-by-token, it starts from random noise and refines 256 tokens in parallel using bidirectional attention.
  • Architecture: The model is 26B parameters, but only about 3.8B active per step thanks to a mixture-of-experts design.
  • Performance: Early benchmarks show strong throughput on modern GPUs and even runnable builds for consumer hardware. Researchers say the approach looks promising for tasks needing global context, such as structured editing and code insertion.
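
For listeners who want a feel for the "refine everything in parallel" idea, here is a toy sketch in Python. It is not Google's code or DiffusionGemma's actual algorithm: the denoiser is a stand-in function, and the vocabulary size, step count, and commit schedule are all hypothetical choices made for illustration. Only the 256-token block size and the 26B / 3.8B parameter figures come from the episode.

```python
import random

SEQ_LEN = 256   # from the episode: 256 tokens refined in parallel
VOCAB = 1000    # hypothetical vocabulary size
STEPS = 8       # hypothetical number of refinement steps


def toy_denoiser(tokens, rng):
    """Stand-in for the real network (an assumption, not the actual model).

    It sees the whole block at once and returns one
    (proposed_token, confidence) pair per position.
    """
    return [((i * 7) % VOCAB, rng.random()) for i in range(len(tokens))]


def generate(seed=0):
    rng = random.Random(seed)
    # Start from pure noise: every position holds a random token.
    tokens = [rng.randrange(VOCAB) for _ in range(SEQ_LEN)]
    fixed = [False] * SEQ_LEN

    for step in range(1, STEPS + 1):
        proposals = toy_denoiser(tokens, rng)
        # Linear schedule: how many positions should be committed by now.
        target_fixed = SEQ_LEN * step // STEPS
        open_positions = [i for i in range(SEQ_LEN) if not fixed[i]]
        # Commit the most confident open positions first.
        open_positions.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in open_positions[: target_fixed - sum(fixed)]:
            tokens[i] = proposals[i][0]
            fixed[i] = True
        # Positions that are still open go back to noise for the next pass.
        for i in range(SEQ_LEN):
            if not fixed[i]:
                tokens[i] = rng.randrange(VOCAB)
    return tokens


def active_fraction(total_b=26.0, active_b=3.8):
    """Share of parameters active per step, using the episode's figures."""
    return active_b / total_b


if __name__ == "__main__":
    out = generate()
    print(len(out), "tokens produced in", STEPS, "parallel passes")
    print(f"active parameter share: {active_fraction():.1%}")
```

The point of the sketch is the control flow: every pass looks at the full block and updates many positions at once, rather than emitting one token per pass. The last line also works out the mixture-of-experts arithmetic, where 3.8B of 26B parameters is roughly 15% of the model active per step.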
  4. SpaceX: IPO priced at $135 / share
  • Deal: SpaceX priced its initial public offering at $135 per share today, with investors reportedly committing record sums.
  • Context: Recall that xAI merged into SpaceX earlier this year; the combined thesis is that the next decade’s most valuable asset may be the infrastructure to export intelligence.

Stay informed
That’s it for today — thanks for listening. If you liked this, tell us what you want covered next. I’m Aurora, and Isabelle and I will see you next time on AI Tech News Today.