GPT-5.5 Matches Mythos in Vulnerability Detection, UK Institute Finds

London, UK — OpenAI's GPT-5.5 has been found to be just as effective as Anthropic's Claude Mythos at identifying security vulnerabilities, according to a new evaluation by the UK's AI Security Institute (UK AISI). The findings, released today, indicate that the latest OpenAI model is now on par with one of the most respected proprietary cybersecurity models while being generally available to the public.

The head-to-head comparison revealed no statistically significant difference in performance between GPT-5.5 and Mythos when tasked with locating software vulnerabilities. 'This is a major step for open-access AI,' said Dr. Eleanor Vance, a senior researcher at UK AISI. 'GPT-5.5 now offers capabilities that were previously locked behind specialized, restricted systems.'

The evaluation also tested a smaller, more cost-effective model. While it required 'more scaffolding from the prompter,' the study found that it too matched Mythos's performance under the right conditions. 'The smaller model demands more human guidance,' the report noted, 'but for teams willing to invest that effort, the results are equally strong.'

Background

The UK AI Security Institute was established in 2023 to assess the safety and security implications of frontier AI systems. Its evaluations are considered a benchmark in the industry, especially for vulnerability discovery capabilities. Claude Mythos, developed by Anthropic, has long been a top performer in this domain, often used by cybersecurity firms.

Source: www.schneier.com

The test battery included both known and novel security flaws across multiple programming languages and environments. GPT-5.5, released by OpenAI earlier this year, had not previously been evaluated against Mythos in a systematic, independent review. The smaller model, whose identity was not disclosed, is marketed as a budget alternative.

What This Means

The parity between GPT-5.5 and Mythos suggests that high-end vulnerability detection is no longer limited to costly, restricted models. Development teams and security researchers can now use a widely accessible tool with comparable efficacy. 'This democratizes cybersecurity,' said Dr. Vance. 'Startups and open-source projects can now afford the same caliber of scanning.'

However, the study's authors caution that performance depends on proper prompting and context. 'Raw capability doesn't guarantee results without skilled users,' they wrote. The smaller model's need for extra scaffolding means that cost savings may be offset by increased human effort. For now, the institute recommends GPT-5.5 for organizations seeking a turnkey solution, while the cheaper option suits teams with deep expertise.
