The Biggest AI Launches of May 2026: A Creator's Recap

May 2026 was the busiest month yet for AI launches. Google held its I/O keynote on May 19 and shipped Gemini 3.5 Flash as the new default in the Gemini app - a Flash-tier model that outperforms its own Pro flagship on coding and agentic benchmarks at 289 tokens per second, roughly four times faster than other frontier models. That was the headline, but it was nowhere near the only release. Inside a 30-day window the industry shipped a new ChatGPT default, a Suno V5 sequel, ElevenLabs' first standalone music app, a 1.6T-parameter open-weight model from DeepSeek, a Grok 4.3 beta from a newly SpaceX-owned xAI, fresh Veo 3.1 editing inside Flow, and a wave of agentic browsers, the largest of which now accounts for roughly half of all measurable agent traffic on the web. Eight major events in seventeen days, by one count from Artificial Analysis.

If you make content for a living, the question is no longer "is there a model that can do this?" The question is "which of the six new ones that shipped this month should I actually learn?" This recap is the answer - organized by category, with the source links so you can verify anything that matters, and a creator-first read on which launches change your workflow versus which are noise.

LLMs: a three-way race that nobody is winning outright

The frontier LLM race got tighter in May, not looser. Three releases stand out.

Gemini 3.5 Flash (May 19, 2026). Google's I/O keynote made Gemini 3.5 Flash the default model in the Gemini app and AI Mode in Search globally. The headline benchmark: it outperforms Gemini 3.1 Pro - Google's own flagship that only launched in February - on coding and agentic tasks while running at roughly 4x the speed. The strategic message Google sent was equally important: the future of AI is autonomous agents that finish multi-step jobs, not faster chat completions. Google also previewed Gemini 3.5 Pro for a June release. Why it matters for creators: cheaper, faster reasoning is the substrate everything else runs on. Brand-voice drafting, batch caption generation, agentic research - all of it gets noticeably snappier on 3.5 Flash.
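To make "noticeably snappier" concrete, here is a rough back-of-the-envelope sketch. It takes the 289 tokens-per-second figure and the "roughly 4x" speed claim from this recap at face value; the 2,000-token draft length is a hypothetical example, and real latency also depends on queueing, network, and prompt size.

```python
# Figures quoted in this recap; treat both as approximate.
FLASH_TOKENS_PER_SECOND = 289.0
SPEEDUP_VS_BASELINE = 4.0  # "roughly 4x" per the article


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Pure generation time, ignoring network and queueing overhead."""
    return tokens / tokens_per_second


# Hypothetical job: one 2,000-token script draft.
draft_tokens = 2_000
fast = seconds_to_generate(draft_tokens, FLASH_TOKENS_PER_SECOND)
slow = seconds_to_generate(draft_tokens, FLASH_TOKENS_PER_SECOND / SPEEDUP_VS_BASELINE)

print(f"3.5 Flash: {fast:.1f}s, 4x-slower baseline: {slow:.1f}s")
```

The per-draft difference is a few seconds versus roughly half a minute, which adds up quickly across batch captioning or multi-step agent runs.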

GPT-5.5 Instant (May 5, 2026). OpenAI rolled GPT-5.5 Instant out to free-tier ChatGPT users on May 5, replacing GPT-5.3 Instant as the default model for everyone. According to TechCrunch's coverage, internal evals show GPT-5.5 Instant produces 52.5% fewer hallucinated claims than GPT-5.3 Instant on high-stakes prompts in medicine, law, and finance - and it notably ships with fewer gratuitous emojis. The full GPT-5.5 model had launched about two weeks earlier, on April 23, for paid tiers, but May was when it reached the broadest user base. GPT-5.5 sits at 59 on the Artificial Analysis Intelligence Index, two points ahead of Gemini 3.1 Pro.

Claude Opus 4.7 (April 16, 2026, but May was the integration month). Anthropic's Opus 4.7 shipped a 10.9-point jump on SWE-bench Pro versus Opus 4.6 - the biggest single-version improvement of any model this year. Pricing held steady at $5 per million input tokens and $25 per million output tokens. May was when Opus 4.7 finished rolling out across Amazon Bedrock, Vertex AI, and Microsoft Foundry, and when GitHub Copilot made it the default for long-form work.
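Per-million-token pricing is easier to reason about with a worked number. The sketch below uses the $5 input and $25 output rates quoted above; the job size (200,000 input tokens, 40,000 output tokens) is a hypothetical example, and the function name is illustrative rather than part of any vendor SDK.

```python
# Rates quoted in this recap for Claude Opus 4.7, in USD per million tokens.
INPUT_USD_PER_MILLION = 5.00
OUTPUT_USD_PER_MILLION = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one job from its input and output token counts."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


# Hypothetical job: a long research brief in, a long-form draft out.
cost = estimate_cost_usd(input_tokens=200_000, output_tokens=40_000)
print(f"Estimated cost: ${cost:.2f}")  # prints "Estimated cost: $2.00"
```

Note that output tokens cost five times as much as input tokens at these rates, so long generated drafts dominate the bill faster than long prompts do.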

The supporting cast in May: xAI's Grok 4.3 Beta keeps pushing on price with $1.25 per million input tokens, native video input, and a 1M-token context, while Mistral pushed Medium 3.5 - a 128B dense open-weight model under MIT license. Nobody won the month outright. Opus 4.7 still leads on complex coding, GPT-5.5 on agentic terminal work, Gemini 3.5 Flash on speed-per-dollar, and Grok on raw cost. For a deeper head-to-head on what each one feels like in creator workflows, see ChatGPT vs Claude vs Gemini for creators in 2026.

AI video: Veo 3.1 evolves, Sora gets buried, Kling holds the value crown

The video category had the most theatrical month of any segment.

Veo 3.1 plus Flow updates (mid-May 2026). Google brought a significant Veo 3.1 update into Flow, the company's creative studio. The big additions: audio is now generated across "Ingredients to Video," "Frames to Video," and "Extend" features for the first time, plus minute-long multi-shot clips that maintain scene consistency through cinematic transitions. Google also previewed reusable AI character templates - save a generated character once and reference it across projects with @CharacterName shortcuts. This is the feature creators have asked Google for since Veo 3 launched. For background on Veo 3.1's full capabilities, our Veo 3.1 complete guide covers prompting, pricing, and where it slots in your stack.

Sora 2 effectively shut down (April 26, 2026 deprecation - May aftermath). This was the surprise of the season. OpenAI deprecated the Sora web and app experiences on April 26, with the API scheduled for shutdown September 24, 2026, per IndieWire. The Disney content licensing deal was also discontinued. The reasoning was strategic - OpenAI is folding video into ChatGPT and a yet-to-be-named successor - but for creators with active Sora workflows, May meant migrating off. Most went to Veo 3.1 or Kling 3.0. For our exit guide and migration paths, see Sora 2 alternatives - best AI video tools 2026.

Kling 3.0 holds the value crown. Kling 3.0, released in February, didn't ship a new version in May, but it spent May taking market share. Its Multi-Shot Storyboard and native lip-sync features kept it the price-to-quality champion across all the May aggregator comparisons we tracked. VO3 AI's March benchmarks put Kling within striking distance of Veo on motion realism at roughly half the per-clip cost. For the full three-way comparison, see Sora 2 vs Veo 3.1 vs Kling 3 in 2026.

The honorable mentions: Runway's Gen-4 Aleph keeps quietly leading on video-to-video editing (add and remove objects, change lighting, generate alternate angles from one shot), and aggregator hubs like Higgsfield are turning multi-model access into a commodity.

AI image: Nano Banana Pro is the real story

The image category had a quiet but decisive May - one model dragged it forward.

Nano Banana Pro / Gemini 3 Pro Image. Google's Nano Banana Pro is the rebrand of Gemini 3 Pro Image - Google's top image generation and editing model, with 4K output, advanced creative controls, and text rendering good enough for infographics and diagrams. The follow-up, Nano Banana 2 (Gemini 3.1 Flash Image), pairs Pro-level quality with much faster generation. In May, Google extended Nano Banana into Google Ads, Workspace, and the new Gemini Personal Intelligence layer, with SynthID watermarks across every output. Creators who used a text-to-image tool on any major platform this month saw Nano Banana 2 quietly become the default backend for fast iteration.

Midjourney V7 and Flux Pro Ultra hold their lanes. No new versions from either in May, but the 2026 image comparison landscape consolidated: Midjourney V7 owns editorial and cinematic, Flux Pro Ultra owns API-accessible photorealism, Ideogram V3 owns text-in-image and product packaging. The picture for creators is finally stable enough to pick a primary and a secondary without regret. Our Flux 1.2 Ultra vs Midjourney V7 deep dive walks the trade-offs.

Imagen 4 quietly powers more of Google's surfaces. Imagen 4 didn't get a numbered version bump in May, but it picked up integration breadth - the model now sits behind several Workspace surfaces alongside Nano Banana, with a clear "quality-first vs editing-first" split.

AI audio: Suno V5 vs ElevenMusic is a war that started in May

If you make music, podcasts, or any video with a soundtrack, May was the most consequential month in the category's history.

Suno V5 keeps the throne. Suno's V5 release earlier in 2026 stayed dominant in May, with its 44.1kHz CD-quality audio and ELO score of 1,293 holding it ahead of every competitor on vocals, fidelity, and musical structure. Suno still has the broadest user base in AI music, with an estimated 15 million MAUs.

ElevenMusic launched (April 1, 2026; May was the consumer push). ElevenLabs released ElevenMusic as a standalone iOS app - text-in, full-song-out - with the differentiator that ElevenLabs trained on licensed data from day one (Merlin Network and Kobalt Music Group deals before launch). For commercial creators, that licensing posture matters more than ELO scores. May was when the app cleared the App Store editorial cycle and started showing up on creator timelines. The strategic move it completes: ElevenLabs now owns voice synthesis, music generation, and sound effects under one subscription and one API.

What this means in practice: Suno still wins blind listening tests, but ElevenMusic wins commercial-safety arguments. Brand creators who can't risk a licensing dispute have a real option for the first time. Both render through Versely's music generator integration; experiment with the same brief in both and pick by ear.

Open source: the year DeepSeek caught up

The open-weight side had its loudest month of the year.

DeepSeek V4 Preview (April 24, 2026 release, May rollout). DeepSeek V4 shipped in two MoE variants: V4-Pro at 1.6T total parameters with 49B activated and a 1M-token context, and V4-Flash at 284B / 13B active. Both are MIT licensed - more permissive than Llama 4's custom community license. The headline number: V4-Pro hits 80.6% on SWE-bench Verified, within 0.2 points of Claude Opus 4.6, under MIT. May was when the inference providers (Together, Fireworks, Runpod) finished spinning up production hosting. For creators, DeepSeek V4-Flash is now a viable backend for high-volume captioning, classification, and chat use cases at roughly one-sixth the cost of closed frontier models.

Qwen 3.5 keeps gaining. Alibaba's Qwen 3.5, released February 16, kept extending its open-weight lead in May. The 397B-A17B flagship is now the third-best open-source model in the world by independent benchmarks - comparable to GPT-5 and Claude Opus on several axes - and the entire family is Apache 2.0 with no commercial restrictions. The smaller Qwen3.5-9B scores 70.1 on MMMU-Pro visual reasoning, 22.5% higher than OpenAI's GPT-5-Nano. Quietly, Qwen has become the default "I want a powerful model that runs on my hardware" pick.

Llama 4 Behemoth still not shipped. Worth noting because creators keep asking: as of May 2026, Llama 4 Behemoth remains in training. Scout and Maverick are the only Llama 4 variants you can actually download. If you need open-weight at frontier capability today, DeepSeek V4-Pro and Qwen3.5-397B-A17B are the answers.

Agentic AI and browsers: the under-the-radar shift

The category nobody talks about loudly is the one that quietly reshaped the web in May.

The agentic browser wars are on. Per Human Security's April State of Agentic Traffic report, Perplexity Comet now drives 48.12% of measurable agentic web requests, ahead of ChatGPT Atlas (21.33%), the Claude Chrome Extension (17.33%), and ChatGPT Agent (8.55%). Comet finished its iOS rollout in March, Atlas continues shipping to all ChatGPT subscribers, and Anthropic - fresh off the February 2026 Vercept acquisition - pushed Claude Sonnet 4.6 to a 72.5% OSWorld score, roughly the ceiling of human performance on that benchmark. May was when "the browser is the agent" stopped being a thesis and started being measurable bandwidth.

Operator and ChatGPT Agent caught up partially. Operator is still trailing on OSWorld (38.1%), but monthly improvements are visible. For creators, what matters is that agents can now reliably do the boring work: schedule posts, pull analytics, fill out forms, monitor brand mentions. The question for the rest of 2026 is which agent surface you trust enough to give your credentials to.

Winners and losers: a brutal one-paragraph read

Winners. Google had the cleanest month: Gemini 3.5 Flash leapfrogged its own Pro, Veo 3.1 inside Flow finally has the editing tools creators asked for, and Nano Banana extended into every Workspace surface. DeepSeek won the open-source narrative. ElevenLabs converted music from a side project into a serious revenue line. Perplexity won the agentic browser usage chart. Losers. OpenAI's Sora shutdown was the most awkward product retreat of the year - they buried a model people loved without naming what replaces it. Meta keeps not shipping Behemoth, which makes Llama 4 look frozen against Qwen and DeepSeek. Runway didn't ship a Gen-5; Aleph is great but the silence around what's next is starting to read as a pause.

What creators should actually try first

Three concrete picks from this month's launches, ranked by impact-per-hour of experimentation.

1. Switch your default chat to Gemini 3.5 Flash for two weeks. It is faster, agentic, and currently free in the Gemini app. If your daily LLM use is brainstorming, scripting, and analytical research, the speed-per-dollar is the most underrated change of May. Re-evaluate after two weeks against Claude Opus 4.7 for the long-form work it does not handle as well.

2. Run the same song brief through Suno V5 and ElevenMusic, pick by ear. Don't pick by reputation. The two systems have different aesthetics - Suno's vocals are denser and more polished, ElevenMusic's are cleaner and more controllable. Use Versely's music generator to A/B from the same prompt and let the use case decide.

3. Move your Sora workflows to Veo 3.1 inside the AI video generator before the API shutdown. September feels far away. It is not. Migration takes longer than expected because Veo's prompt grammar is different. Start now.

How Versely fits into May's launches

Versely's job is to be the orchestration layer that absorbs each new model release without making creators relearn their workflow. May's integrations:

The text-to-video tool routes between Veo 3.1, Kling 3.0, Seedance, and the Sora 2 fallback chain - the API stays the same as models arrive, deprecate, or change pricing.
The text-to-image tool added Nano Banana 2 as a fast default while preserving Flux Pro Ultra, Midjourney V7, Ideogram V3, and Imagen 4 as picks.
The music generator routes between Suno V5 and ElevenMusic based on whether you need ELO-winning vocals or licensed-from-day-one commercial safety.
The agentic chat layer now defaults to Gemini 3.5 Flash for speed and falls back to Claude Opus 4.7 for the prompts that need deeper reasoning.

The point of routing through Versely is that you do not have to track which model shipped on which Tuesday. You write the brief, pick the surface, and the right backend gets selected per task. That matters more in a month like May than in any other, because no single model wins every category.
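The routing idea described above can be sketched as an ordered fallback chain per task. This is a minimal illustration, not Versely's actual implementation: the backend identifiers, the `FALLBACK_CHAINS` table, the `route` function, and the `BackendUnavailable` exception are all hypothetical names, and `fake_backend` stands in for real model calls.

```python
from typing import Callable, Dict, List, Tuple


class BackendUnavailable(Exception):
    """Raised by a backend call when a model is deprecated, down, or over quota."""


# Hypothetical preference order per task; the identifiers are illustrative labels.
FALLBACK_CHAINS: Dict[str, List[str]] = {
    "video": ["veo-3.1", "kling-3.0", "seedance"],
    "music": ["suno-v5", "elevenmusic"],
    "chat": ["gemini-3.5-flash", "claude-opus-4.7"],
}


def route(task: str, brief: str, call_backend: Callable[[str, str], str]) -> Tuple[str, str]:
    """Try each backend for the task in order; return (backend, result) from the first success."""
    failures: List[str] = []
    for backend in FALLBACK_CHAINS[task]:
        try:
            return backend, call_backend(backend, brief)
        except BackendUnavailable as exc:
            failures.append(f"{backend}: {exc}")
    raise RuntimeError(f"no backend available for {task!r}: " + "; ".join(failures))


def fake_backend(backend: str, brief: str) -> str:
    """Stand-in for a real model call: pretends the first video backend is offline."""
    if backend == "veo-3.1":
        raise BackendUnavailable("simulated outage")
    return f"[{backend}] rendered: {brief}"


if __name__ == "__main__":
    print(route("video", "10-second product teaser", fake_backend))
```

The caller only ever names the task and the brief, so a deprecation like Sora's becomes a one-line change to the chain rather than a workflow rewrite. A production router would presumably also weigh cost, licensing needs, and prompt difficulty, which this sketch leaves out.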

FAQ

Which new model is best for video right now? Veo 3.1 inside Flow with the May updates is the new default for narrative video with native audio across all generation modes. Kling 3.0 stays the value pick for high-volume social work. Sora is shutting down - migrate.

What's the cheapest LLM right now? For chat and reasoning at frontier-adjacent quality, DeepSeek V4-Flash is the price floor for open weights and Gemini 3.5 Flash is the price floor for hosted closed models. xAI's Grok 4.3 at $1.25 per million input tokens is competitive when you need native video input.

Did open source catch up to closed models in May? On benchmarks, largely yes - DeepSeek V4-Pro is within 0.2 points of Claude Opus 4.6 on SWE-bench Verified under MIT license, and Qwen3.5-397B-A17B is comparable to GPT-5 and Claude Opus on several axes. On polish, integration, and product experience, closed models are still ahead. The gap is narrowing month by month.

Is Sora 2 still the best video model? It is shutting down. The web and app experiences ended April 26 and the API ends September 24. Creators who valued Sora 2's physics and storyboard mode are mostly moving to Veo 3.1 or Kling 3.0. See our Sora 2 alternatives roundup for migration paths.

What did Versely add this month? Nano Banana 2 as a fast image default, GPT-5.5 Instant and Gemini 3.5 Flash in the chat routing pool, ElevenMusic alongside Suno V5 in the music generator, and updated Veo 3.1 prompt templates to take advantage of the new audio-across-all-modes capability inside Flow.

Closing takeaway

The pattern of May 2026 is the same pattern that defined April: too much shipped in too short a window for any one creator to track in real time. The right response is not to chase every release. It is to pick one default per category (chat, image, video, audio), give it two weeks of real use, then re-evaluate when the next big launch lands. That cadence matches how often frontier shifts actually happen. Everything else is noise. Treat the Versely tool suite as the layer that absorbs the churn so you can stay focused on the only thing models still cannot do for you: deciding what to make.