OpenAI GPT-5.4-Cyber: Trusted Access for Defenders
OpenAI expands its Trusted Access for Cyber program with GPT-5.4-Cyber, giving vetted security teams early access to its most capable cybersecurity AI model yet.
OpenAI’s Child Safety Blueprint sets new standards for protecting minors online. Here’s what’s in it, why it matters, and whether it’s enough.
OpenAI’s new Safety Fellowship funds independent alignment researchers outside its walls. Here’s what the pilot program offers and why it matters now.
OpenAI launches a Safety Bug Bounty program targeting prompt injection, agentic abuse, and data exfiltration. Here’s what it covers and why it matters.
OpenAI’s foundation is committing at least $1 billion to cure diseases, expand economic opportunity, and build AI resilience. Here’s what that actually means.
OpenAI releases prompt-based teen safety policies via gpt-oss-safeguard, giving developers a practical way to moderate age-specific risks in AI apps.
OpenAI is using chain-of-thought monitoring to catch misalignment in its internal coding agents before it becomes a real problem. Here’s what they found.
OpenAI’s IH-Challenge trains frontier LLMs to follow trusted instructions first, boosting safety and blocking prompt injection attacks. Here’s what it means.
OpenAI released the GPT-5.4 Thinking system card on March 5, 2026. Here’s what the safety document actually says and why it matters right now.
OpenAI published the GPT-5.4 Thinking system card. Here’s what it says about safety, reasoning limits, and what this model is actually allowed to do.
Google has published a formal statement on the Gavalas lawsuit, arguing Gemini’s mental health safeguards are built alongside medical professionals.
For years, AI models have been treated as black boxes: data goes in, predictions come out, and nobody fully understands what happens in between. That’s changing. MIT Technology Review named mechanistic interpretability a top breakthrough technology for 2026, and the research coming out of Anthropic, OpenAI, and Google DeepMind is revealing how AI models actually…