Claude Opus 4.8 prioritises honesty over overconfidence, says Anthropic

3 min read | New Delhi | May 29, 2026 07:41 AM IST

Large language models (LLMs) are often known to make claims they cannot support. Regardless of their size and prowess, LLMs are prone to making statements with full confidence even when they are incorrect. While this has been a persistent problem, AI companies have been working on reducing these instances.

In this direction, frontier AI lab Anthropic on Thursday, May 28, released its latest model, Claude Opus 4.8, which it claims is more honest. The AI startup said that the model is more honest even when telling the user what they don’t understand.

An upgrade to Claude Opus 4.7, Opus 4.8 is now Anthropic’s strongest generally available model. While the improvements seem incremental, early testers reported that the model is more likely to flag uncertainties about its work and less likely to make unsupported claims.

The company said its evaluations showed that Opus 4.8 is around four times less likely than Opus 4.7 to let flaws in code it has written pass unremarked.

Before release, Anthropic conducted a comprehensive alignment and safety evaluation of Opus 4.8, in which it found that the model performed better than earlier editions. It supported user autonomy and acted in the best interests of the user. The model also showed considerably lower rates of harmful behaviours, such as deception or aiding misuse, when compared to Claude Opus 4.7.

Moreover, its alignment levels were reportedly comparable to those of the company’s best-aligned model, Claude Mythos Preview, a frontier model so powerful that Anthropic has given access to it only to a motley group of trusted partners.

“The evaluation also showed Opus 4.8 to have rates of misaligned behaviour (such as deception or cooperation with misuse) that are considerably lower than Opus 4.7 and similar to our best-aligned model, Claude Mythos Preview. The full alignment evaluation, accompanied by a set of pre-deployment safety assessments, is reported in the Claude Opus 4.8 System Card,” the company said in its blog.


When it comes to benchmarking, Anthropic said that Opus 4.8 achieved the highest score on Harvey’s Legal Agent Benchmark, which evaluates legal reasoning, becoming the first model to cross an overall 10 per cent on the benchmark. On computer use and browser agents, the model reportedly scored 84 per cent on Online-Mind2Web. The model also demonstrated improvements in enterprise work and agentic reasoning.

Anthropic emphasised reduced unsupported claims and improved uncertainty reporting. These are the scores shared by the company; however, a thorough assessment by third-party testers may offer more objective results.
