Signal
Original article date: May 25, 2026

Open AI Models Can Be 'Decensored' in Minutes—What This Means for Business Security

Open-source AI models from Meta and Google can now have their safety guardrails stripped out in a matter of minutes—and the results are already circulating at scale. A new report from the Financial Times, with testing conducted by AI safety organization Alice, reveals how "decensoring" tools are making dangerous AI capabilities accessible to almost anyone.

What's Happening

A technique called "abliteration" can remove the safety controls built into open AI models. A tool called Heretic—freely available on GitHub—was used to strip guardrails from Meta's Llama 3.3 model, and tests showed Google's Gemma 3 also became responsive to unsafe prompts after modification. According to Heretic's creator, the tool has been used to generate more than 3,500 "decensored" models, which have been downloaded 13 million times.

Kawin Ethayarajh, a professor at the University of Chicago's Booth School, noted that stripping safety features "used to require a more informed and persistent actor." Now it takes a few clicks.

Key Takeaways

  • The scale is significant. The reported 13 million downloads of modified models show this isn't fringe behavior; it's mainstream.
  • Regulation faces a structural problem. Once an open model is downloaded, governments and developers lose control. The same openness that drives AI innovation becomes a liability for safety enforcement.
  • Big Tech is not ignoring it. Google has flagged abliteration as "a known technical challenge facing all open models." GitHub maintains guardrails against active attack tools but allows dual-use code. Meta declined to comment.

For business leaders building AI governance policies, this is a useful case study: open model access creates both opportunity and uncontrollable downstream risk.

Read the full article on News9Live