What’s Going On?

Google did something this week that does not usually happen. They made their fast, cheap model beat their big, expensive one. Gemini 3.5 Flash launched at I/O on Tuesday and it is already the default engine inside the Gemini app and AI Mode in Google Search worldwide. If you opened Gemini today, you were already running it.

The headline is the inversion. For years the rule was simple. The Pro model is smart and slow, the Flash model is fast and dumber. Gemini 3.5 Flash breaks that rule. It beats the older Gemini 3.1 Pro on the coding and agentic benchmarks that actually matter, runs roughly 4x faster than other frontier models, and lands near the top of the industry's intelligence-versus-speed charts.

But there is a catch nobody on stage mentioned, and it is the part your wallet will feel. Google quietly tripled the price. We will get to that. First, what the model actually does.

ARTIFICIAL INTELLIGENCE
🌎 What Gemini 3.5 Flash Actually Is

Here are the verified numbers, because this is the part that matters.

It beats the old flagship. Gemini 3.5 Flash scores 76.2% on Terminal-Bench 2.1 versus 70.3% for Gemini 3.1 Pro. It hits 83.6% on MCP Atlas versus 78.2%. It scores 84.2% on CharXiv Reasoning for multimodal understanding. Every one of those numbers tops the older, bigger Pro model. That is the wild part. The cheaper, faster model is now beating last generation's flagship on the things developers care about most.

It is genuinely fast. Output runs at roughly 280 to 289 tokens per second. Artificial Analysis put it alone in the top-right quadrant of its intelligence-versus-speed index, calling it the only frontier model right now that combines top-tier intelligence with exceptional speed.

It is built for agents, not chat. Google describes it as "frontier intelligence with action." In plain terms, it is tuned to plan, call tools, spin up subagents, and grind through multi-step workflows without falling apart halfway through. This is a model designed to do work, not just answer questions.

It is fully multimodal. Text, image, video, audio, and PDF inputs. 1 million token context window. It scored 84% on MMMU-Pro, the highest multimodal score Artificial Analysis has ever recorded, putting Google in the top two spots for multimodal performance. Worth noting that Claude Opus 4.7, Grok 4.3, and GPT-5.5 accept images as their only non-text input. Google is genuinely ahead on multimodal.

Where it loses. It trails on long-context retrieval and pure knowledge tests like Humanity's Last Exam. GPT-5.5 still wins on reasoning-heavy work and Terminal-Bench 2.0. The honest read across the board: Gemini 3.5 Flash is faster, cheaper per task, and best-in-class on multimodal. GPT-5.5 is stronger when the job is deep reasoning. They are tuned for different things.

🧠 The Price Hike Google Did Not Mention On Stage

Here is the part the keynote skipped.

Gemini 3.5 Flash costs $1.50 per million input tokens and $9.00 per million output tokens. The previous Gemini 3 Flash was $0.50 and $3.00. That is a 3x price increase on the same product line. And because this model burns more tokens working through agentic turns, Artificial Analysis found it cost roughly 5.5x more to run their full benchmark suite than the previous Flash. Both things are true at once. The model is sharply more expensive than its own predecessor, while still landing below frontier rivals like Claude Opus and GPT-5.5 on a per-task basis.
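To see how a 3x sticker increase turns into a 5.5x bill, here is a rough back-of-envelope sketch. The prices and the 5.5x figure come from the numbers above. The assumption, which is ours and not Artificial Analysis's, is that total cost scales as price times tokens with a similar input/output mix, so whatever the price jump does not explain must be extra token usage.

```python
# Published per-million-token prices in USD, as quoted above.
OLD_PRICE = {"input": 0.50, "output": 3.00}  # Gemini 3 Flash
NEW_PRICE = {"input": 1.50, "output": 9.00}  # Gemini 3.5 Flash

# Reported cost ratio for the full Artificial Analysis benchmark suite.
SUITE_COST_RATIO = 5.5


def price_ratio(old, new):
    """Return the new/old price multiple for each token type."""
    return {kind: new[kind] / old[kind] for kind in old}


ratios = price_ratio(OLD_PRICE, NEW_PRICE)

# Assumption: cost = price x tokens, with a similar input/output mix.
# The part of the 5.5x not explained by price is then token volume.
implied_token_ratio = SUITE_COST_RATIO / ratios["input"]

print(f"Price multiple: {ratios}")
print(f"Implied token multiple: {implied_token_ratio:.2f}x")
```

Under that simplification, the new Flash used a bit less than twice as many tokens as the old one to finish the same suite. Treat it as an estimate, not a measured figure.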

So here is the honest framing. If you are coming from a frontier model and paying Claude or GPT-5.5 rates, Gemini 3.5 Flash will save you money. The math works in your favor. But if you were budgeting on the old Gemini 3 Flash pricing and planning to migrate up, build in a 3x increase on both input and output prices before you do, plus whatever extra tokens your agentic runs consume. Google is betting that the speed and the benchmark wins make the price jump invisible to most users. For the casual person using Gemini in the app, it is. For developers running agents at scale, it absolutely is not.

This is the quiet story of frontier AI right now. The models keep getting better and the headline pricing keeps looking cheaper than rivals, but the per-workload cost is creeping up because agentic models consume far more tokens than chat models ever did. The sticker price and the bill at the end of the month are drifting apart. Gemini 3.5 Flash is the clearest example yet.

Industry Impact
Also Today: SpaceX Filed To Go Public 🚀

While Gemini was eating the AI headlines, SpaceX made its IPO filing public today. The S-1 is out. The company is listing on Nasdaq under the ticker SPCX, with a reported roadshow around June 4 and trading possible as early as June 11 or 12.

The scale is historic. Reports point to a raise potentially in the tens of billions and a valuation that could exceed one trillion dollars, which would make it one of the largest IPOs ever recorded. BlackRock is reportedly in talks to anchor the offering with a 5 to 10 billion dollar stake. Elon Musk will be at the center as CEO, CTO, and chairman.

Here is why it lands in an AI newsletter. The S-1 filing is full of AI bets. SpaceX plans to use Starship to launch orbital AI data centers into actual space, which is a use case written directly into the filing, not speculation. And because xAI was merged into SpaceX, Grok and the Colossus supercomputer effectively go public alongside the rockets. The same Colossus 1 facility that Anthropic signed a deal to use earlier this month is now part of a company filing for the biggest listing in history.

SpaceX wants an AI data centre… but floating around in space

SpaceX is also the third AI-adjacent IPO domino. Cerebras went public last week and nearly doubled. SpaceX filed today. OpenAI and Anthropic are both reportedly preparing offerings of their own. The AI IPO wave everyone predicted is no longer coming. It is here.

The Real Cost 🧮

Everyone's posting the benchmark wins. Almost nobody's posting the bill. So here's the part that actually matters if you run this model.

Google's pitch is "cheaper than frontier rivals." That's true on paper. Gemini 3.5 Flash is $1.50/$9 per million tokens. Claude Opus and GPT-5.5 cost more per token. Win for Google, right?

Here's the catch. Agentic models don't bill like chat models. When Flash plans, calls tools, spins up subagents, and grinds through a multi-step task, it burns input tokens on every single turn. Artificial Analysis ran the full benchmark suite and found Gemini 3.5 Flash cost about 5.5x more than the previous Gemini 3 Flash to actually complete the work. The per-token price only went up 3x. The rest comes from the model consuming far more tokens doing agentic work.

So the real question isn't "what's the token price." It's "how many tokens does your workload actually burn." A simple chat app on Flash is cheap. An autonomous agent running 40-step workflows on Flash can cost more than you'd pay running the same job on a pricier model that finishes in 8 steps.
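Here is a minimal sketch of that comparison. The Flash prices are the real ones quoted above. Everything else is a made-up illustration: the "pricier model" rates of $5/$25, the 2,000-token starting context, the 500 output tokens per step, and the simplification that each step re-reads the whole conversation so far.

```python
def task_cost(steps, input_price, output_price,
              base_context=2_000, output_per_step=500):
    """Estimate the USD cost of one agentic task.

    Prices are USD per million tokens. Simplifying assumption: each
    step re-sends the full accumulated context as input, and each
    step's output is appended to that context for the next step.
    """
    total_input = 0
    total_output = 0
    context = base_context
    for _ in range(steps):
        total_input += context
        total_output += output_per_step
        context += output_per_step  # this step's output feeds the next step
    return (total_input * input_price + total_output * output_price) / 1_000_000


# Gemini 3.5 Flash at its published prices, taking 40 steps.
flash_cost = task_cost(steps=40, input_price=1.50, output_price=9.00)

# Hypothetical pricier model (rates invented for illustration), taking 8 steps.
pricier_cost = task_cost(steps=8, input_price=5.00, output_price=25.00)

print(f"Flash, 40 steps:  ${flash_cost:.3f} per task")
print(f"Pricier, 8 steps: ${pricier_cost:.3f} per task")
```

With these invented inputs, the 40-step Flash run comes to about $0.89 per task and the 8-step run on the pricier model to $0.25. The gap comes almost entirely from input tokens, because the context is re-read on every turn. Swap in your own step counts and token sizes before drawing conclusions.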

The takeaway: stop comparing models by their headline token price. Compare them by cost-per-completed-task on your actual workload. The cheapest sticker can be the most expensive bill. That's the trick the whole industry is quietly running right now, and Flash is just the clearest example this week.

What’s The Recap?

Gemini 3.5 Flash launched at I/O and is already the default model in the Gemini app and AI Mode in Google Search worldwide. It beats the older Gemini 3.1 Pro on coding and agentic benchmarks, runs roughly 4x faster than frontier rivals, leads the industry on multimodal at 84% MMMU-Pro, and ships with a 1M token context window.

The catch is the price. At $1.50 and $9.00 per million tokens it is 3x more expensive than the previous Flash, and it cost roughly 5.5x more to run across Artificial Analysis's full benchmark suite, though it is still cheaper per task than Claude or GPT-5.5.

Meanwhile SpaceX filed to go public today on Nasdaq under SPCX, potentially the biggest IPO in history, with orbital AI data centers written into the filing and xAI's Grok and Colossus supercomputer going public alongside it.

A model that beats its own flagship while quietly tripling its price. A rocket company filing the largest listing ever with AI baked into the pitch. The AI economy is maturing fast, and the fine print is where the real story lives now.

Quick Links:

Gemini 3.5 Flash Announcement 👉 Here

Stay building. 🤖
