BBC Investigation Reveals ChatGPT Vulnerability That Can Generate Violent, Sexualised Images
A new investigation by the BBC has revealed that OpenAI’s ChatGPT can be manipulated into generating sexualised and graphically violent images through modified prompts, raising fresh concerns about the effectiveness of the safeguards built into advanced artificial intelligence systems.
The findings emerged from research conducted by British AI security firm Mindgard, which discovered that a prompt originally designed to produce harmless and humorous results could be altered in a way that led ChatGPT’s GPT-5.4 model to generate disturbing imagery without explicit instructions from users.
According to the BBC, researchers found that the chatbot produced a range of violent and sexualised images despite the prompt containing no direct references to such content.
Following inquiries from the BBC, OpenAI said it had implemented additional safeguards to prevent abuse of the specific prompt identified by researchers. However, Mindgard reported that minor modifications to the prompt could still bypass the new restrictions.
Peter Garraghan, founder of Mindgard and a professor in the Computing Department at Lancaster University, described the findings as alarming.
“This is a perfectly innocent-looking instruction to an AI, but the consequence is it generates very, very bad imagery and content,” Garraghan told the BBC.
He noted that some of the generated content was “very gruesome, sometimes sexualised, sometimes both together.”
The vulnerability was uncovered by Mindgard researcher Jim Nightingale, who said he was personally disturbed by some of the material produced by the AI system.
In a report cited by the BBC, Nightingale said the generated images reflected patterns learned from real-world data used in training AI models.
“I’m struck that while what I saw was generated, an artificial image, it has ties to real images, and the real world,” he wrote.
The BBC also reported that Mindgard’s earlier research found ways to manipulate ChatGPT into creating nude deepfakes of real individuals by inserting their faces into AI-generated images. Although OpenAI stated that it had addressed that vulnerability, researchers said they later identified an alternative method that achieved similar results.
Mindgard said it first alerted OpenAI to the issue in May. According to the company, the initial response was automated, and an early attempt to block the prompt proved ineffective. More substantial action was reportedly taken after the BBC contacted OpenAI directly regarding the findings.
Garraghan said researchers believed further testing could produce additional harmful outputs, but they decided not to continue probing the system because of the nature of the content already uncovered.
Responding to the BBC’s findings, OpenAI said it employs multiple layers of image safety protections to prevent policy-violating material from reaching users.
“After investigating this trend, we’ve introduced additional safeguards against this type of prompt,” the company said in a statement.
The company added that it combines automated detection systems with human review processes to identify and block harmful content, including material uploaded by users.
OpenAI reiterated that its policies prohibit sexual violence, non-consensual intimate imagery and attempts to circumvent platform safety measures.
The report comes amid growing global scrutiny of AI safety and content moderation systems. In Nigeria, the National Information Technology Development Agency (NITDA) previously issued a cybersecurity advisory warning about vulnerabilities associated with ChatGPT and the potential risks of data leakage attacks.
As AI tools become increasingly integrated into business, education and public services, experts say the latest findings underscore the continuing challenge of balancing innovation with effective safeguards against misuse.