GPT 5.6 Soul Tests, Fable 5 Return, and More AI News

GPT 5.6 Soul Hands-On: Minecraft, Pokemon, and SpaceX Simulations
GPT 5.6 Family Pricing and Context Window
Fable 5 Restoration: Trump Administration Nears Approval
The Dario Amodei Fear-Mongering Debate
GLM 5.5 Reportedly Matching Mythos in Security
Anthropic Overtakes OpenAI in Enterprise AI Spending
Grok 4.5 Private Beta at SpaceX and Tesla
Model Comparison Table
Weekly AI News Timeline
Summary and Key Takeaways

Introduction

The AI landscape continues to move at extraordinary speed. OpenAI has unexpectedly previewed its GPT 5.6 family (Soul, Terra, and Luna), and early hands-on testing shows impressive gains in code generation, game development, and simulation. Meanwhile, the Trump administration is nearing approval for Anthropic to restore Fable 5 after weeks of restrictions. Grok 4.5 has entered private beta at SpaceX and Tesla, and GLM developer Zai is reportedly working on a model that matches Claude Mythos in security vulnerability detection. This article covers these developments and what they mean for the competitive landscape.

GPT 5.6 Soul Hands-On: Minecraft, Pokemon, and SpaceX Simulations

Early testing of GPT 5.6 Soul reveals substantial improvements over GPT 5.5 Pro across game generation, simulation, and front-end development. In a Minecraft clone test, Soul generated a complete playable world with a landing page that allows players to tweak options, select game modes, and create a world. The generated environment includes different block types, cloud animations, a day and night cycle, mobs, crafting, a breaking block animation, and varied terrain including desert biomes. The generation took approximately 90 minutes. While the block pickup mechanic did not function correctly and the overall refinement does not match Fable 5, the visual quality and environmental atmosphere are notably improved over previous GPT iterations.

In a head-to-head comparison, GPT 5.6 Soul and GPT 5.5 Pro were both tasked with building a SpaceX Starship booster catch simulation. Soul produced an accurate recreation of the booster catch sequence with the clamps functioning as they do in real SpaceX operations. The output demonstrates strong physical simulation understanding and visual polish.

Perhaps the most impressive demo is a one-shot Pokemon-style RPG generated in just 31 minutes from a short prompt. Due to copyright restrictions, the model could not recreate Pokemon exactly, so it built an original game inspired by the series. The game includes multiple gyms, eight collectible badges, a Pokemon-style starter selection, and a Pokemon League-style endgame system. The result demonstrates Soul's ability to maintain coherence across a large, complex codebase with multiple interacting systems.

In voxel generation, Soul produced a detailed 3D world in approximately 44 minutes and 30 seconds, substantially outperforming GPT 5.5 Pro on aesthetic quality and 3D coherence. The model demonstrates strong proficiency in working with three-dimensional spaces, appropriate volume profiles, material choices, and lighting.

Across all tests in max reasoning effort mode, the improvements over GPT 5.5 are clearly visible. The increased reasoning budget, moving from approximately 768 to nearly 1,000, allows the model to think longer before finalizing responses. This translates to better output quality across front-end design, simulations, games, full-stack tasks, and general coding. The real standout areas appear to be cybersecurity, hard sciences, and biology, where the model achieves strong results while using fewer tokens than competing approaches.

GPT 5.6 Family Pricing and Context Window

Model	Role	Input (per 1M tokens)	Output (per 1M tokens)
Soul	Flagship reasoning model	$5.00	$30.00
Terra	Balanced everyday model	$2.50	$15.00
Luna	Fast, cost-efficient model	$1.00	$6.00

All three models reportedly come with a 1.5 million token context window, which is significantly larger than that of most competing models. Soul is positioned as the step-function improvement over GPT 5.5 at the same price as the previous flagship. Terra delivers performance competitive with GPT 5.5 at roughly half the cost, making it a strong option for production workloads that need GPT 5.5-class intelligence at a better price point. Luna targets high-volume tasks where speed and cost efficiency are the primary concerns.
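To make the price tiers concrete, the per-request arithmetic can be sketched as follows. This is a minimal illustration that uses the reported preview prices from the table above. The function name `estimate_cost` and the example workload of 200,000 input tokens and 20,000 output tokens are hypothetical and not taken from any OpenAI documentation, and the final prices may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Reported price in USD per 1 million tokens."""
    input_per_million: float
    output_per_million: float


# Reported preview prices from the pricing table; treat these as assumptions.
GPT_5_6_PRICING = {
    "Soul": Pricing(input_per_million=5.00, output_per_million=30.00),
    "Terra": Pricing(input_per_million=2.50, output_per_million=15.00),
    "Luna": Pricing(input_per_million=1.00, output_per_million=6.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request (hypothetical helper)."""
    price = GPT_5_6_PRICING[model]
    total = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 20k output tokens.
    for name in GPT_5_6_PRICING:
        print(f"{name}: ${estimate_cost(name, 200_000, 20_000):.2f}")
```

For this hypothetical workload the estimates come to $1.60 for Soul, $0.80 for Terra, and $0.32 for Luna, which shows how Terra lands at half of Soul's cost and Luna at one-fifth for identical token counts.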

The GPT 5.6 family is currently in limited preview for API and Codex only, available to a small group of US government-approved trusted partners. OpenAI unexpectedly launched the models ahead of earlier predictions, likely influenced by the ongoing US government scrutiny of frontier AI models. Broader availability is expected in the coming weeks, with some Codex users already receiving access quietly. Users can check their backend analytics dashboard to see if they have been granted access.

On benchmarks, OpenAI states that GPT 5.6 Soul sets a new state of the art on Terminal Bench 2.1 for agentic coding workflows, outperforming GPT 5.5, Claude Mythos 5, Fable 5, and Gemini 3.1 Pro. OpenAI also reports major gains in biology and cybersecurity with lower token usage. According to the company, Soul performs competitively with Mythos on advanced vulnerability research while using roughly one-third of the output tokens. Soul also introduces a new max reasoning mode and an ultra mode that uses multiple sub-agents to tackle complex tasks.

Fable 5 Restoration: Trump Administration Nears Approval

According to a new report, the Trump administration is close to allowing Anthropic to restore access to Fable 5 after the model has been offline for roughly two weeks. Insiders familiar with the discussions expect the restriction could be lifted as early as this week. If that happens, it would signal that negotiations between Anthropic and the US government are finally making progress after weeks of uncertainty.

The restored version will most likely ship with modifications. Industry observers expect stricter safety guardrails, tighter access controls, and potentially reduced capabilities in certain high-risk cybersecurity areas compared to the original release. Even with those limitations, restoring access would be a significant development for users who had integrated Fable 5 into their workflows before the shutdown.

Anthropic released an official statement a few days ago confirming that discussions with the US government are making progress. Since June 12th, the company has been working closely with the government to restore access to Mythos 5 and Fable 5. As a first step, the government has approved a redeployment of Mythos 5 to a select group of US organizations that operate and defend critical infrastructure. Anthropic stated it is continuing to work toward restoring broader access and making Fable 5 available again.

Reports indicate that roughly 100 organizations have already regained access to Mythos 5 and Fable 5, suggesting the Department of Commerce is gradually easing the embargo rather than lifting it all at once. Public access is likely to remain more restricted than before, with higher barriers for access to the most capable versions of the models.

The Dario Amodei Fear-Mongering Debate

A misconception circulating across social media blames US AI companies like Anthropic and OpenAI, particularly Anthropic CEO Dario Amodei, for the government restrictions on frontier models. Some observers claim his public comments about AI risk are responsible for the Fable 5 ban and future restrictions on US models.

This framing oversimplifies the situation. The US government has access to the NSA, intelligence agencies, cybersecurity experts, and its own scientific advisers. Decisions to restrict frontier AI models that represent billions of dollars in value and affect the entire AI race are almost certainly based on the government's own internal assessments rather than on one executive's public statements.

However, Anthropic can be criticized for how it reportedly handled communication with the US government during the early stages of discussions. Multiple reports suggest the company did not cooperate as effectively as officials expected, particularly around responding to national security concerns. That is a separate issue from the claim that Amodei convinced the government to ban AI.

The more likely explanation is that next-generation frontier models have reached a level where they pose meaningful cybersecurity risks if they fall into the wrong hands. The government's primary concerns appear to be preventing adversaries like China from obtaining or reproducing these capabilities through techniques such as model distillation, and preventing these models from being used for offensive cyber operations against US infrastructure. Whether one agrees with that approach is a separate discussion, but the evidence points to national security considerations rather than a reaction to one CEO's remarks.

GLM 5.5 Reportedly Matching Mythos in Security

Zai, the company behind GLM, is reportedly working on a new model that matches Claude Mythos in finding security vulnerabilities. This may be GLM 5.5. If confirmed, this would be a significant development because Mythos has been viewed as one of the strongest models in the world for cybersecurity and long-horizon vulnerability research.

It also demonstrates how intense the AI race has become between US and Chinese labs, with cybersecurity emerging as one of the most important competitive battlegrounds. The key question is whether the model actually matches Mythos in real-world security work or whether this is limited to specific benchmarks. Either way, the development shows that Chinese AI labs are not slowing down in their pursuit of frontier capabilities.

Anthropic Overtakes OpenAI in Enterprise AI Spending

Anthropic reportedly overtook OpenAI in US corporate AI spending, measured by paid transactions, between late 2025 and early 2026. This marks a significant shift in the enterprise AI landscape. The AI war is no longer just about who has the smartest model. It is about which model becomes embedded in the daily workflows of an organization.

Anthropic gained traction by focusing heavily on software development, which serves as an ideal proving ground because the work is already digital, results are easier to measure, and companies can clearly see improvements in build times, bugs, and review cycles. This mirrors the strategy that made Microsoft Windows and Office the default business environment in the 1990s: alternatives existed, but Microsoft controlled how work actually got done.

This does not mean OpenAI has lost the enterprise. More likely, companies will use different models for different jobs. Anthropic may be preferred for coding, Gemini for multimodal tasks, OpenAI for general productivity, and cheaper models for lower-value tasks. The next phase of the AI race is workflow economics: which system reduces friction, which can be governed effectively, and which delivers enough value to justify the cost?

Grok 4.5 Private Beta at SpaceX and Tesla

Elon Musk announced that Grok 4.5 has entered a private beta phase at both SpaceX and Tesla. According to Musk, the model performs at around the level of Claude Opus, though it is unclear whether he is referring to Opus 4.8 or 4.7.

The model is reportedly built on xAI's new 1.5 trillion parameter V9 foundation model, with additional training on Cursor's data to further improve coding capabilities. If Grok 4.5 delivers on these claims, it could become another serious contender in the frontier AI race. The private beta at SpaceX and Tesla suggests xAI is testing the model in demanding, real-world environments before a broader release.

Model Comparison Table

Model	Category	Input Cost (per 1M tok)	Output Cost (per 1M tok)	Context Window	Key Strength
GPT 5.6 Soul	Flagship	$5.00	$30.00	1.5M	Agentic coding, biology, security
GPT 5.6 Terra	Balanced	$2.50	$15.00	1.5M	GPT 5.5-class at roughly half the cost
GPT 5.6 Luna	Fast/Cheap	$1.00	$6.00	1.5M	High-volume, cost-efficient
Claude Fable 5	Frontier	N/A (restricted)	N/A	N/A	Best overall coding
Claude Mythos 5	Frontier	N/A (restricted)	N/A	N/A	Cybersecurity, vulnerability research
Grok 4.5	Frontier (beta)	TBA	TBA	TBA	Claims Opus-level performance

Weekly AI News Timeline

Date	Event
This week	GPT 5.6 Soul, Terra, Luna previewed with limited access
This week	GPT 5.6 Soul hands-on testing: Minecraft clone, Pokemon RPG, SpaceX sim
This week	Trump administration close to approving Fable 5 restoration
This week	Anthropic statement: Mythos 5 redeployed to select critical infrastructure organizations; reports cite ~100 orgs regaining access
This week	Grok 4.5 enters private beta at SpaceX and Tesla
This week	Zai reportedly developing GLM 5.5 to match Mythos in security
This week (possible)	Fable 5 restriction could be lifted
Coming weeks	Broader GPT 5.6 access expected

Frequently Asked Questions

When will GPT 5.6 be widely available? Broader access for ChatGPT, Codex, and API users is expected in the coming weeks. Some Codex users are already receiving access. The models are currently in limited preview for US government-approved partners.

Is Fable 5 coming back? Most likely. The Trump administration is reportedly close to approving restoration. Anthropic confirmed progress in an official statement, and roughly 100 organizations have reportedly regained access to Mythos 5 and Fable 5. The model is expected to return with stricter safety guardrails.

How does GPT 5.6 compare to Fable 5? Early testing suggests GPT 5.6 Soul is behind Fable 5 overall but extremely close. It wins in certain areas, matches in others, and slightly loses in a few. The gap has narrowed significantly from GPT 5.5.

What is the GPT 5.6 context window? All three models reportedly support a 1.5 million token context window.

Is Grok 4.5 publicly available? Not yet. It is in private beta at SpaceX and Tesla. Elon Musk claims it performs around the level of Claude Opus.

Summary and Key Takeaways

GPT 5.6 Soul hands-on testing shows strong gains over GPT 5.5 Pro across game generation, simulation, and front-end development, with a complete Minecraft clone generated in approximately 90 minutes and a Pokemon-style RPG in 31 minutes
All three models (Soul, Terra, Luna) reportedly support a 1.5 million token context window with pricing from $1/$6 to $5/$30 per 1M tokens
According to OpenAI, Soul sets a new state of the art on Terminal Bench 2.1 and performs competitively with Mythos on vulnerability research while using roughly one-third of the output tokens
Fable 5 restoration is nearing approval, with roughly 100 organizations already regaining access and broader restoration expected soon with stricter safety measures
The Dario Amodei fear-mongering narrative oversimplifies the situation: government restrictions are most likely based on internal national security assessments, though Anthropic's early communication with officials could have been more effective
Zai is reportedly developing a model, possibly GLM 5.5, that matches Claude Mythos in cybersecurity vulnerability detection, signaling intensifying US-China AI competition
Anthropic overtook OpenAI in US enterprise AI spending by paid transactions as the battle shifts from model quality to workflow economics
Grok 4.5 enters private beta at SpaceX and Tesla, built on a 1.5 trillion parameter V9 model with claims of Opus-level performance