JHB News

Claude Opus 4.8 prioritises honesty over overconfidence, says Anthropic

May 29, 2026
Claude Opus 4.8 combines incremental capability upgrades with stronger alignment. (Image: Anthropic)

New Delhi, May 29, 2026, 07:41 AM IST

Large language models (LLMs) are often known to make claims they cannot support. Regardless of their size and prowess, LLMs are prone to making statements with full confidence even when they are incorrect. While this has been a persistent problem, AI companies have been working on reducing these instances.

In this direction, frontier AI lab Anthropic on Thursday, May 28, released its latest model, Claude Opus 4.8, which it claims is more honest. The AI startup said that the model is more honest even about telling the user what it does not understand.

An upgrade to Claude Opus 4.7, Opus 4.8 is now Anthropic’s most powerful generally available model. While the improvements appear incremental, early testers reported that the model is more likely to flag uncertainties about its work and less likely to make unsupported claims.

The company said its evaluations showed that Opus 4.8 is around four times less likely than Opus 4.7 to let flaws in code it has written pass unremarked.

Before release, Anthropic conducted a comprehensive alignment and safety evaluation of Opus 4.8, in which it found that the model performed better than earlier versions. It supported user autonomy and acted in the best interests of the user. The model also showed significantly lower rates of harmful behaviours, such as deception or assisting misuse, when compared to Claude Opus 4.7.

Moreover, its alignment levels were reportedly similar to those of the company’s best-aligned model, Claude Mythos Preview, an Anthropic frontier model so powerful that the company has given access to it to a motley group of trusted partners.

“The assessment also showed Opus 4.8 to have rates of misaligned behaviour (such as deception or cooperation with misuse) that are considerably lower than Opus 4.7 and similar to our best-aligned model, Claude Mythos Preview. The full alignment assessment, accompanied by a set of pre-deployment safety assessments, is reported in the Claude Opus 4.8 System Card,” the company said in its blog.

When it comes to benchmarking, Anthropic said that Opus 4.8 achieved the highest score on Harvey’s Legal Agent Benchmark, which evaluates legal reasoning, becoming the first model to cross an overall 10 per cent on the benchmark. On computer use and browser agents, the model reportedly scored 84 per cent on Online-Mind2Web. The model also demonstrated improvements in enterprise work and agentic reasoning.

Anthropic emphasised reduced unsupported claims and improved uncertainty reporting. These are the scores shared by the company; however, a thorough evaluation by third-party testers may offer more objective results.


