Baidu DROPS an OPEN-SOURCE AI that Rivals GPT-5! 🤯
Meet “ERNIE 4 Vision” — The Multimodal BEAST Taking on GPT-5 & Gemini 2.5 Pro 🚀

⚙️ The Big Drop
Baidu just shook the AI world.
Their new open-source multimodal model claims to outperform GPT-5 and Gemini 2.5 Pro — and it’s free for commercial use.
“ERNIE 4 Vision” isn’t just a text model — it sees, reasons, and builds like a hybrid between Gemini and Claude Sonnet.
It’s designed for developers who want full-stack control of their AI workflows without paying OpenAI-scale costs.
🧠 What Makes It Different
28 B total parameters, but only ~3 B are active per token (the "A3B" in the model name), which keeps inference fast.
Handles images + text + documents natively — perfect for agent workflows, RAG, or AI research tools.
Apache 2.0 License → ✅ commercial use ✅ custom fine-tuning ✅ no API lock-ins.
Beats GPT-4 Turbo on OCR, visual reasoning, and multi-document understanding tasks.
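Curious how a 28 B model gets away with using only ~3 B parameters at a time? That pattern is typical of mixture-of-experts designs, where a small router picks a few "experts" for each token. The post doesn't describe ERNIE's actual router, so treat this as a toy sketch: the expert count, the top-k value, and the random scores are all made-up assumptions for illustration.

```python
import random

def top_k_route(scores, k):
    """Return the indices of the k highest-scoring experts."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

def active_fraction(total_params_b, active_params_b):
    """Share of the weights that actually run for a single token."""
    return active_params_b / total_params_b

# Hypothetical toy configuration, NOT ERNIE's real architecture.
NUM_EXPERTS = 16
TOP_K = 2

rng = random.Random(0)
router_scores = [rng.random() for _ in range(NUM_EXPERTS)]  # stand-in for router logits
chosen = top_k_route(router_scores, TOP_K)

print(f"experts used for this token: {chosen} ({TOP_K} of {NUM_EXPERTS})")
# 28 B total and ~3 B active are the figures quoted in the post.
print(f"active share of weights: {active_fraction(28, 3):.1%}")
```

With the post's numbers, roughly a tenth of the weights do the work for any given token, which is where the speed claim comes from.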
📊 Benchmark Results: ERNIE vs The World
Baidu backed up its claims with serious data — the model doesn’t just compete, it beats Gemini 2.5 Pro and GPT-5 High in 10 of 15 benchmarks.

🚀 ERNIE 4.5-VL-28B-A3B-Thinking outperforms Gemini 2.5 Pro and GPT-5 High in most STEM, chart-understanding, and video-reasoning tasks — while remaining a compact, efficient model.
Highlights from the chart:
🧮 MathVista: ERNIE 82.5, just behind Gemini 2.5 Pro (82.7) and ahead of GPT-5 High (81.3)
📈 DocVQA (val): 95.3, the highest among all three models
📊 ChartQA / AI2D: ERNIE leads both Gemini and GPT-5 by > 3 points
🧩 Average score: ERNIE 73.1 → Gemini 70.3 → GPT-5 69.4
In short: ERNIE 4 Vision isn’t just open — it’s competitive at the very top.
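Want to see how big those gaps really are? Here's a quick sketch that computes ERNIE's margin over each rival, using only the scores quoted in the highlights. The `margins` helper is a hypothetical name made up for this example.

```python
# Scores as quoted in the highlights above (model -> score).
average = {"ERNIE": 73.1, "Gemini 2.5 Pro": 70.3, "GPT-5 High": 69.4}
mathvista = {"ERNIE": 82.5, "Gemini 2.5 Pro": 82.7, "GPT-5 High": 81.3}

def margins(scores, model="ERNIE"):
    """Points by which `model` leads (positive) or trails (negative) each rival."""
    base = scores[model]
    return {name: round(base - s, 1) for name, s in scores.items() if name != model}

print("average:", margins(average))
print("MathVista:", margins(mathvista))
```

A positive number means ERNIE is ahead. On MathVista the margin against Gemini 2.5 Pro comes out slightly negative, so the lead is in the average, not in every single benchmark.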
💻 Why Devs Care
This isn’t another lab demo. It’s a production-ready open model with real power.
Devs can now:
Build multimodal copilots that parse images, PDFs & dashboards.
Train local enterprise assistants with visual comprehension.
Run lightweight inference on GPUs without API bottlenecks.
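Before you plan "lightweight inference" on your own GPU, it helps to estimate how much memory the weights need. Here's a back-of-the-envelope sketch. It assumes all 28 B parameters stay resident in memory even though only ~3 B are active at a time (the usual case for sparse models without offloading), and it counts weights only, so caches and activations come on top. The precision table and function name are illustrative assumptions, not anything Baidu published.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params_billion, precision):
    """Approximate weight footprint in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * BYTES_PER_PARAM[precision]

TOTAL_PARAMS_B = 28  # total parameter count quoted in the post

for precision in BYTES_PER_PARAM:
    gb = weight_memory_gb(TOTAL_PARAMS_B, precision)
    print(f"{precision:>5}: ~{gb:.0f} GB of weights")
```

Under these assumptions the weights alone take about 56 GB at bf16 and about 14 GB at 4-bit, so quantization is what decides whether the model fits on a single card.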
Baidu’s message?
“If OpenAI and Google won’t give you control — we will.”
🧠 The Bottom Line
Baidu didn’t just release a model — they declared war on closed AI.
A 28 B parameter beast that sees, reads, and builds — all open source.
The AI arms race just turned multimodal and open.
AI freedom has arrived — and its name is ERNIE 4 Vision. 🐉⚡

