OpenAI has warned that its upcoming AI models could exhibit ‘high’ levels of capability in cybersecurity and pose serious risks if they were to be misused.
The next generation of AI models could, for instance, be used to remotely deploy zero-day exploits against well-defended systems or enable threat actors to compromise complex enterprise operations with real-world impact, the ChatGPT maker said in a blog post on Wednesday, December 10.
For its part, OpenAI said that it is investing in strengthening its models for defensive cybersecurity tasks, including developing tools that let cybersecurity teams audit code and patch vulnerabilities more easily. “Our goal is for our models and products to bring significant advantages for defenders, who are often outnumbered and under-resourced,” the company said.
OpenAI just isn’t the one one which seems to be tamper-proofing its personal AI fashions and instruments in anticipation of a future with frequent and extra subtle AI-led cybersecurity threats. Earlier this week, Google introduced it’s upgrading its Chrome browser safety structure towards oblique immediate injection assaults that could possibly be used to hijack AI brokers – forward of rolling out Gemini agentic capabilities in Chrome extra extensively.
In November 2025, Anthropic disclosed that threat actors, likely a Chinese state-sponsored group, had manipulated its Claude Code tool to carry out a highly sophisticated AI-led espionage campaign that was disrupted by the AI startup.
To highlight how quickly AI’s cybersecurity capabilities have advanced, OpenAI said that GPT‑5.1-Codex-Max scored 76 per cent on capture-the-flag (CTF) challenges last month, up from the 27 per cent scored by GPT‑5 in August this year.
Layered security stack
To mitigate the risks, OpenAI said it is taking a defense-in-depth approach that involves a combination of access controls, infrastructure hardening, egress controls, and monitoring. In terms of more concrete steps, the Microsoft-backed AI startup said it is:
– Training AI models to refuse or safely respond to harmful requests while remaining useful for educational and defensive use cases.
– Improving system-wide monitoring across products that use frontier models to detect potentially malicious cyber activity (a simplified sketch of this kind of screening follows the list).
– Working with expert red teaming organisations to evaluate and improve safety mitigations.
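OpenAI has not published its internal monitoring code, but a loosely analogous screening step is available to developers today through its publicly documented Moderation endpoint. The following is a minimal sketch, assuming the official openai Python SDK and an OPENAI_API_KEY set in the environment; it is not OpenAI’s internal monitoring stack.

# Minimal sketch: screening a user request with OpenAI's public
# Moderation endpoint before it reaches a model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def screen_request(prompt: str) -> bool:
    """Return True if the prompt is flagged by the moderation model."""
    result = client.moderations.create(
        model="omni-moderation-latest",
        input=prompt,
    )
    return result.results[0].flagged

if screen_request("How do I patch a SQL injection bug in my app?"):
    print("Request flagged for review")
else:
    print("Request passed moderation")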
Aardvark, its AI agent designed to double as a security researcher, is currently in private beta. Aardvark can scan codebases for vulnerabilities and propose patches that maintainers can adopt quickly. It will be made available for free to select non-commercial open source repositories, OpenAI said.
As for broader ecosystem-focused initiatives, OpenAI said it will set up a Frontier Risk Council, an advisory group comprising external cybersecurity experts, along with a trusted access programme for users and developers.

