Fable 5 Return, GPT 5.6 Delay, and AI Chip News

6/26/2026 · 20 min read


Introduction

This week has been one of the most eventful in recent AI history. Claude Fable 5 is poised to return after a two-week shutdown, Anthropic has accused Alibaba of orchestrating the largest AI model theft campaign on record, OpenAI unveiled its first custom AI chip, and Google DeepMind is facing both talent attrition and disappointing benchmark results from its next-generation model. Each of these stories carries significant implications for the competitive landscape, and together they show how quickly the industry is moving.

For developers and teams building on these platforms, the developments this week affect everything from which models are available to how much they cost and what legal risks may surround their use. Understanding the full context behind each story is essential for making informed decisions about AI tooling and infrastructure investments.

Claude Fable 5 Nears Return

Claude Fable 5, which was taken offline approximately two weeks ago, is now showing strong signs of an imminent return. The model has been spotted inside Amazon Bedrock's model catalog as well as in the latest Claude Code version 2.19 update. Several new string changes in the 2.19 release strongly hint that Anthropic is preparing for Fable 5 to come back online, and the evidence is mounting from multiple independent sources.

One of the most telling changes is a new message added to Claude Code: "You've used your Fable 5 usage for this week." This is a major signal because it suggests Fable 5 may no longer be treated as a separate paid add-on, which was the original business model, but could instead be included directly inside Claude subscriptions with weekly usage limits. This would represent a significant shift in Anthropic's pricing strategy and could make the model accessible to a much wider user base than before.

The model was also reportedly visible inside Claude's chatbot interface, the iOS app, and Amazon Bedrock's model catalog with an option to open it in Playground mode. For developers using AWS infrastructure, the Bedrock listing is particularly meaningful because AWS typically does not list models that are not at least being prepared for deployment.

PolyMarket odds for Fable 5 returning before July 31st have surged from 45% to over 90% following these client-side discoveries. Prediction markets aggregating bettor sentiment are often more responsive to concrete technical signals than to rumors, and the sharp movement suggests market participants found the evidence credible. Multiple Claude Code team members have also posted about the model's impending return on social media, further fueling confidence.

Anthropic's head of growth responded to the speculation, stating that the company can "categorically confirm that Fable 5 and Mythos were not currently serving any live traffic" and that the team was investigating whether the sightings were a front-end UI bug based on historical data or simply people trolling online. However, official statements from growth-oriented roles at AI companies often serve dual purposes: managing expectations while avoiding premature confirmation that could affect regulatory discussions or partner negotiations. The presence of Fable 5 in AWS listings and production code changes strongly suggests Anthropic is actively preparing for a return regardless of whether traffic is currently live.

Why Fable 5 Was Taken Down: Project Glass

The reason for Fable 5's removal has now been clarified with authoritative reporting from multiple news outlets. According to Reuters and the Associated Press, the US government restricted the model after Claude Mythos identified vulnerabilities in highly sensitive US government computer systems during a controlled testing exercise with a Washington-based intelligence agency. These tests reportedly took place under Project Glass, a restricted program designed to find and fix vulnerabilities in critical software before real attackers can exploit them.

The implications of this reporting are significant. It means the model was not removed due to a safety failure or alignment problem in the traditional sense, but rather because it was almost too effective at its assigned task. The model was doing exactly what it was asked to do, and it did it so thoroughly that the results raised national security concerns about the potential for misuse.

Senator Mark Warner referenced this during a congressional hearing, stating that he was told by NSA Chief Joshua Rudd that "Mythos broke into almost all classified systems not in weeks but in hours." This timeline is what makes the story so remarkable. Traditional penetration testing by human teams can take weeks or months to identify vulnerabilities across a large attack surface. Mythos accomplished in hours what would take human teams an entire engagement cycle to achieve.

This was part of a controlled security test rather than an autonomous rogue AI attack, which puts the subsequent model restrictions into clearer context. The US government did not ban the model because it was dangerous in an uncontrolled sense, but because its capabilities in this specific domain were so advanced that the government needed time to assess the full implications before allowing continued public access.

Political Dynamics Behind the Fable 5 Restoration

According to a Wired report, the Trump administration has become more receptive to discussions with Anthropic's team after CEO Dario Amodei was replaced in White House meetings by co-founder Tom Brown. This personnel change appears to have shifted the tenor of negotiations. The report suggests conversations have gone smoothly since the change, which, combined with the Fable 5 sightings in Bedrock and Claude Code, points to a resolution being reached soon.

The political dimension adds another layer of complexity. Fable 5's return is not purely a technical decision at Anthropic; it involves coordination with US government agencies that have a direct interest in how frontier AI capabilities are managed. The fact that prediction markets now place the probability of a return by July 31st above 90% suggests that market participants believe these discussions are approaching a conclusion.

For developers and enterprises that had integrated Fable 5 into their workflows and were forced to pivot when it was taken offline, the prospect of its return with a new pricing model (weekly usage limits included in subscriptions) would be welcome news. The key question will be whether Anthropic imposes stricter usage caps or monitoring on the restored model compared to its initial release.

Anthropic Accuses Alibaba of Mass Model Theft

Anthropic is now accusing Alibaba-linked operators of carrying out what it describes as potentially the largest AI theft campaign ever documented. According to Bloomberg, Anthropic claims these operators created nearly 25,000 fraudulent accounts used to generate approximately 28.8 million Claude exchanges between April and June. The scale of this operation is unprecedented in the AI industry.

The goal was allegedly to extract Claude's capabilities without paying for proper API access, particularly in areas like software engineering, agentic reasoning, and advanced coding. This type of attack is known as model distillation: using frontier model outputs to train a competing model at a fraction of the cost. Instead of spending hundreds of millions on training data and compute, a bad actor can simply query a frontier model millions of times and use the outputs to train a cheaper replica.

Anthropic's filing names not just Alibaba but also DeepSeek, Moonshot AI, and Minimax as part of a broader pattern of Chinese AI labs attempting to harvest capabilities from US frontier models. The breadth of the accusations suggests Anthropic believes this is a systemic issue rather than a one-off incident. The company has sent letters to the Senate and White House calling for urgent action, arguing that this has moved beyond model competition into industrial-scale AI espionage.

For the broader AI ecosystem, this story raises uncomfortable questions about API security, rate limiting, and account verification. If a determined actor can create 25,000 accounts and generate nearly 29 million exchanges over three months without being detected earlier, then current safeguards are clearly insufficient. Expect significant changes to how AI companies handle API access and account verification in response.
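The reported figures also imply a fairly low volume per account, which may help explain why the activity was not flagged sooner. The following is a rough back-of-the-envelope sketch, not an analysis from the filing. It assumes the exchanges were spread evenly across all accounts and across a 91-day window (April 1 through June 30); both are simplifying assumptions, and the inputs are the rounded figures reported above.

```python
# Back-of-the-envelope arithmetic using the rounded figures reported above.
# Assumptions (not from the source): traffic was spread evenly across
# accounts and across a 91-day window (April 1 through June 30).
TOTAL_EXCHANGES = 28_800_000
ACCOUNTS = 25_000
DAYS = 30 + 31 + 30  # April + May + June

per_account = TOTAL_EXCHANGES / ACCOUNTS
per_account_per_day = per_account / DAYS

print(f"Exchanges per account: {per_account:,.0f}")
print(f"Exchanges per account per day: {per_account_per_day:.1f}")
```

Under these assumptions each account averages roughly a dozen exchanges per day, a level that would likely look unremarkable in isolation. That suggests detection would depend more on correlating accounts with each other than on per-account caps alone.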

Google DeepMind Struggles: Talent Exodus and Gemini Delays

Google DeepMind is experiencing a difficult period that raises questions about its ability to maintain its position as a leading AI research organization. According to Business Insider, Gemini 3.5 Pro has been delayed to July, with new checkpoints currently being tested in the Arena and within Google's own chatbot. Early testing results are not encouraging, and the signals from the testing community suggest deeper problems.

Some users are seeing two models under the same Gemini 3.1 Pro preview name in Arena battle mode. One appears to be the older Gemini 3.1 Pro checkpoint while the other may be a newer Gemini 3.5 Pro checkpoint being tested in disguise. In an SVG generation test where the prompt asked for a BMW M4 CS, the newer checkpoint reportedly performed worse than the older Gemini 3.1 Pro version. For a next-generation model to underperform its predecessor on a relatively standard creative task is an unusual and concerning signal.

Even more concerning, Gemini 3.5 Pro reportedly may not have fresh knowledge of 2025 or 2026 events. If a model launching in mid-2026 lacks awareness of events from 2025, that suggests either a training data cutoff that predates its release by a year or more, or training pipeline issues that prevented more recent data from being incorporated. Either explanation is problematic for a model positioning itself as a current-generation offering.

Adding to Google's challenges, the company is losing key AI talent at an accelerating rate. According to Bloomberg, Jonas and Alexander, two key contributors to the Gemini models, are leaving to join Anthropic. These departures follow several other high-profile exits in recent weeks, placing increasing pressure on Google as OpenAI and Anthropic continue to attract top researchers from DeepMind. The researchers leaving were part of the initial Gemini team and were instrumental in the early success of those models. Their departure means Google is not just losing individual contributors, but the institutional knowledge and architecture expertise that made Gemini competitive in the first place.

Issue | Impact
Gemini 3.5 Pro delayed to July | Product timeline slippage
Newer checkpoints underperform older ones | Regression in quality
No fresh 2025/2026 knowledge | Training data pipeline issues
Jonas and Alexander leave for Anthropic | Loss of key Gemini architects
Multiple high-profile exits in weeks | Erosion of DeepMind talent base

GPT 5.5 Instant Update and GPT 5.6 Delay

OpenAI has released an update to GPT 5.5 Instant, describing it as "much more fun to talk to." The update improves intent understanding, response adaptability, and the handling of complex constraints. It also makes shopping and local recommendations feel more useful and cohesive. The update rolled out first to paid users with free users receiving access the following day.

The emphasis on making the model more conversational and enjoyable is notable because it signals that OpenAI is paying attention to the qualitative feel of interactions, not just benchmark scores. Improvements to shopping and local recommendations also suggest OpenAI is positioning ChatGPT as a practical daily tool for commerce and discovery, not just a productivity assistant for developers and knowledge workers.

Regarding GPT 5.6, the model has been delayed. While earlier reports suggested a June 25th launch, the public release is now expected in the second week of July. GPT 5.6 has started rolling out to OpenAI Enterprise Partners for testing ahead of the wider launch, following the same pattern OpenAI used with GPT 5.5 where enterprise preview preceded public availability by about two weeks.

The update reportedly brings a new maximum reasoning effort for the GPT 5.6 series. Pricing is expected to remain the same as GPT 5.5, which is positive news for users concerned about cost increases. However, GPT 5.6 may be less token efficient than GPT 5.5, potentially using more tokens for equivalent tasks. This trade-off between reasoning depth and token consumption means users may need to evaluate whether the improved quality justifies the higher per-task cost.
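The trade-off can be made concrete with simple arithmetic: at an unchanged per-token price, per-task cost scales directly with the number of tokens consumed. The sketch below is illustrative only. The price, the token counts, and the 30% increase are hypothetical placeholders, not published GPT 5.5 or GPT 5.6 figures, and `cost_per_task` is a helper defined here for the example.

```python
def cost_per_task(tokens_used: int, price_per_million_tokens: float) -> float:
    """Return the dollar cost of one task at a flat per-token price."""
    return tokens_used / 1_000_000 * price_per_million_tokens

# Hypothetical numbers for illustration only; not published figures.
PRICE_PER_MILLION = 10.00   # same price assumed for both models
baseline_tokens = 20_000    # tokens the older model uses on a task
heavier_tokens = 26_000     # assume the newer model uses 30% more

baseline_cost = cost_per_task(baseline_tokens, PRICE_PER_MILLION)
heavier_cost = cost_per_task(heavier_tokens, PRICE_PER_MILLION)
increase = (heavier_cost - baseline_cost) / baseline_cost

print(f"Baseline cost per task: ${baseline_cost:.2f}")
print(f"Heavier cost per task:  ${heavier_cost:.2f}")
print(f"Relative increase:      {increase:.0%}")
```

Because the price is held constant, the relative cost increase equals the relative increase in tokens. Teams can substitute their own measured token counts to estimate the real effect on their workloads.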

GPT 5.6 thinking mode has also appeared for enterprise users in ChatGPT, consistent with the enterprise-first pattern of the GPT 5.5 rollout. The enterprise testing phase allows OpenAI to gather real-world performance data, identify edge cases, and make final adjustments before the broader release.

OpenAI Unveils Jalapeno Custom AI Chip

OpenAI has unveiled Jalapeno, its first custom AI chip designed from scratch for large language model inference. This marks OpenAI's move deeper into the full AI stack, encompassing chips, kernels, memory, networking, racks, scheduling, deployment, and the final product experience. This vertical integration strategy mirrors what Apple and Tesla have done in their respective industries: controlling the full stack to optimize for their specific workloads.

The chip was built with Broadcom and Celestica and is optimized around the workloads OpenAI runs across ChatGPT, Codex, its API platform, and future agentic products. Early samples are already running ML workloads in the lab at target frequency and power. OpenAI claims performance per watt should be substantially better than current state-of-the-art hardware, positioning it as a direct competitor to NVIDIA in certain use cases. For OpenAI, which runs one of the largest AI inference workloads in the world, even modest per-watt improvements translate into enormous cost savings at scale.

Perhaps most striking is the development timeline: Jalapeno went from initial design to manufacturing tape-out in just nine months, which OpenAI believes is one of the fastest advanced ASIC deployment cycles ever. To put that in perspective, custom chip development typically takes 18 to 36 months. A nine-month cycle from design to tape-out is exceptional by any standard.

ChatGPT itself helped engineers design parts of the chip, demonstrating how AI can contribute to better chip design and potentially lower compute costs across the industry. This creates a virtuous cycle: better AI helps design better chips, which run AI more efficiently, which enables better AI. For the broader industry, the implication is that AI-assisted chip design could accelerate hardware innovation cycles across the board.

The strategic logic behind Jalapeno is clear: less dependence on external GPUs, more control over compute economics, and a stronger flywheel between models, products, revenue, and infrastructure. Deployment is planned to begin by the end of 2026.

Chip Detail | Information
Name | Jalapeno
Purpose | LLM inference (ChatGPT, Codex, API, agents)
Partners | Broadcom, Celestica
Timeline | Design to tape-out in 9 months
Status | Early samples running at target frequency/power
Deployment target | End of 2026
Strategic goal | Reduce NVIDIA dependency, control compute costs

Claude Tag, Cursor Leaderboards, and OCR4

Anthropic introduced Claude Tag, a new way for teams to work with Claude directly inside Slack. Claude joins a workspace almost like another team member. Admins can choose which channels and tools Claude can access, and team members can tag Claude in a thread to delegate tasks. This eliminates the need to constantly switch between applications, allowing teams to bring Claude directly into their conversation environment for context summarization, task follow-up, and tool-connected work.

The Slack integration is a smart move for Anthropic because it positions Claude as a collaborative tool embedded in existing workflows rather than a separate application that requires context switching. For teams already living in Slack, having Claude available via @mention means AI assistance becomes as natural as asking a colleague for help. The admin-controlled access model also addresses enterprise concerns about data governance and appropriate use.

Cursor added a new team leaderboard for plugins, skills, and MCPs. Teams can now see which tools are being used most across their workspace and add them to their own setup with one click from a customized page. This is a practical update for teams trying to standardize their AI coding workflow. The leaderboard creates social proof effects that can help teams discover useful tools they might not have tried otherwise, while the one-click installation reduces friction for adopting team-wide standards.

Keshoc introduced OCR4, a model that extracts structured documents with bounding boxes, block classification, inline confidence scores, and support for over 170 languages. In one demonstration, OCR4 converted a handwritten calculus exam photo into clean LaTeX in 5.1 seconds for approximately 9 cents. The formulas were converted correctly, which is typically the hardest part of OCR because mathematical notation is highly structured and context-dependent.

The model does not redraw graphs but detects them, boxes them, and tags them as charts rather than silently ignoring them. This is a subtle but important design choice: rather than producing an incorrect or garbled output for a component it cannot handle, the model explicitly marks what it has identified, preserving document structure even where full content extraction is not possible. This makes OCR4 useful for understanding the full structure of messy documents, not just reading text.
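One way to picture this "detect and tag rather than drop" behavior is as a list of typed blocks in which the transcribed content is optional. The sketch below is purely illustrative: the `Block` class, its field names, and the sample values are hypothetical and are not OCR4's actual output schema, which the announcement does not describe at this level of detail.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Block:
    """Hypothetical document block; not OCR4's real schema."""
    kind: str                                 # e.g. "text", "formula", "chart"
    bbox: tuple[float, float, float, float]   # (x0, y0, x1, y1) on the page
    confidence: float                         # score between 0 and 1
    content: Optional[str] = None             # None if detected but not transcribed

def summarize(blocks: list[Block]) -> tuple[list[Block], list[Block]]:
    """Split blocks into transcribed ones and ones that were only detected."""
    transcribed = [b for b in blocks if b.content is not None]
    detected_only = [b for b in blocks if b.content is None]
    return transcribed, detected_only

# Made-up sample page: one formula with LaTeX content, one chart without.
page = [
    Block("formula", (40, 120, 520, 160), 0.97, r"\int_0^1 x^2 \, dx"),
    Block("chart", (40, 200, 520, 480), 0.91),
]

transcribed, detected_only = summarize(page)
print([b.kind for b in transcribed])
print([b.kind for b in detected_only])
```

Keeping the chart as a block with a bounding box and no content means downstream code can still account for every region of the page, for example by routing detected-only regions to a separate tool or to a human reviewer.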

Ornith 1.0 Open Source Coding Models

A new developer has released Ornith 1.0, a family of open-source large language models focused on coding. The lineup includes a 9-billion-parameter dense model for lightweight local deployment, a 31-billion-parameter dense model for balanced performance, a 35x mixture-of-experts model, and a 397-billion-parameter model for maximum capability. All sizes are available under the MIT license, making them free for commercial and research use.

Strong benchmark results are claimed across Terminal Bench, SWE-Bench, SWE-Atlas, and other coding evaluations. The models rank highly compared to Qwen 3.7 Max and outperform DeepSeek V4 Pro in several benchmarks, coming close to Opus 4.7 on certain evaluations. For teams that have been priced out of using frontier models at scale, an MIT-licensed model approaching Opus 4.7 performance on coding tasks could be a game-changer.

The models were post-trained on Gemma 4 and 3.5 using a self-improving reinforcement learning strategy that improves both coding solutions and the scaffold around the task. This means the training process optimizes not just the code output but the entire problem-solving approach, including how the model structures its reasoning, handles edge cases, and manages iterative refinement.

The release under MIT license is significant because it removes the most common barrier to adoption for open source models in commercial settings. Unlike models with restrictive licenses that limit use cases or require royalties, MIT-licensed models can be freely integrated into commercial products, fine-tuned on proprietary data, and deployed at any scale without licensing concerns.

Demis Hassabis on the Meaning of AGI

Google DeepMind co-founder Demis Hassabis recently shared his perspective on artificial general intelligence. According to Hassabis, AGI has always meant one core idea: a system that can learn from anything and produce output in any format, similar to how a human mind works. His point is that modern civilization was built by hunter-gatherer brains, which demonstrates what general intelligence can achieve when it is flexible enough.

Hassabis noted that for decades, AI was mostly hard-coded rule-based systems, but the field now finally feels like it is catching up to the original vision of general intelligence. He stated that AGI is something that will be coming fairly soon, aligning with the broader industry consensus that the transition from narrow AI systems to more general capabilities is accelerating.

The framing is worth paying attention to because Hassabis is one of the few people in the industry who has been working toward AGI as an explicit goal for over a decade. His definition emphasizes flexibility and adaptability rather than benchmark performance or specific capabilities. A system that can learn from anything and produce output in any format is a fundamentally different concept from a system that excels at coding or reasoning but cannot adapt to novel domains.

Weekly AI News Timeline

Date | Event
This week | GPT 5.5 Instant update released to paid users
This week | Jalapeno custom AI chip unveiled by OpenAI
This week | Claude Tag for Slack launched by Anthropic
This week | Ornith 1.0 open source models released
This week | OCR4 launched by Keshoc
Late June | Fable 5 spotted in Amazon Bedrock and Claude Code v2.19
Late June | Anthropic accuses Alibaba of 28.8M-query distillation attack
Early July | GPT 5.6 expected public release (delayed from June 25)
July | Gemini 3.5 Pro delayed to July with weak checkpoint results
By July 31 | Fable 5 return odds at 90%+ on PolyMarket
End of 2026 | Jalapeno chip deployment planned

Frequently Asked Questions

When will Claude Fable 5 be available again? All signs point to a return before July 31st, with PolyMarket odds at 90%. The model has been spotted in Amazon Bedrock and Claude Code v2.19, and political discussions between Anthropic and the US government are reportedly progressing smoothly.

Why was Fable 5 taken down? The model was restricted after Mythos identified vulnerabilities in sensitive US government systems during Project Glass testing. The model broke into classified networks in hours during controlled security exercises, raising national security concerns.

When is GPT 5.6 launching? GPT 5.6 has been delayed from its original June 25th target to the second week of July. Enterprise partners already have access for testing.

What is the Jalapeno chip? Jalapeno is OpenAI's first custom AI chip for LLM inference, built with Broadcom and Celestica. It went from design to tape-out in 9 months and is planned for deployment by the end of 2026.

Is Ornith 1.0 free to use? Yes. All model sizes are released under the MIT license, free for commercial and research use.

Summary and Key Takeaways

  • Claude Fable 5 expected to return within weeks: sightings in Amazon Bedrock and Claude Code v2.19, with PolyMarket odds at 90% for a July return. New usage-limit strings suggest it may be included in subscriptions rather than as a paid add-on
  • Fable 5 was restricted after Mythos broke into classified US government systems in hours during Project Glass testing. The Trump administration is reportedly engaged in productive talks with Anthropic for restoration
  • Anthropic accuses Alibaba of creating 25,000 fraudulent accounts to generate 28.8 million Claude queries for model distillation, naming DeepSeek, Moonshot AI, and Minimax as part of a broader pattern
  • Google DeepMind losing top talent (Jonas and Alexander, key Gemini contributors, joining Anthropic) while Gemini 3.5 Pro checkpoints underperform older versions and the launch is delayed
  • GPT 5.5 Instant received a conversational update; GPT 5.6 delayed to July with enterprise previews underway, pricing unchanged but potentially less token efficient
  • OpenAI's Jalapeno chip went from design to tape-out in 9 months using ChatGPT-assisted design, targets end of 2026 deployment as an NVIDIA competitor
  • Claude Tag for Slack, Cursor team leaderboards, OCR4, and Ornith 1.0 (MIT-licensed coding models approaching Opus 4.7 on benchmarks) round out the week's announcements
  • Demis Hassabis shared his AGI definition: a system that can learn from anything and output in any format, stating it will arrive fairly soon
Written by OutGrave Team