Gemini (Google) - capabilities

The long-context heavyweight and image powerhouse of the big three: reach for Gemini when you need to feed in an entire document or generate slide-grade visuals. A standard member of our three-model toolkit alongside Claude and ChatGPT. Our view from 500+ client engagements; capabilities evolve quickly.

Best at

Huge context windows (~1M tokens) - reading entire documents end-to-end
Full-context injection alongside Claude for long memos and contracts
Image generation via Nano Banana Pro (a reasoning image model, 4K, editable)
Accessible programmatically via the AI Studio API + MCP
Strong general reasoning (Gemini 3.x) in the frontier pack

Capability snapshot

Capability	Verdict	What that means
Long documents / context	✅ Leads	Our go-to for ~1M-token context and true full-context reading of whole documents, not just scanning.
Image generation	✅ Strong	Nano Banana Pro "thinks before it creates" and follows prompts precisely - slide-grade visuals.
AI agents / API	✅ Strong	Exposes an AI Studio API that Wouter wires into MCPs so agents can generate images on command.
Overall capability	✅ Strong	A genuine frontier model - Wouter recommends trying it for a month just to experience what's possible.
Reasoning	🟡 Capable	Strong in the frontier pack; Wouter also uses it as the live example of how any LLM can guess wrong on hard maths.
Cross-model use	🟡 Capable	Run next to Claude and ChatGPT as a standard multi-model practice (challenge them against each other).
Workspace integration / grounding	🟡 Capable	Part of the Google ecosystem; direct workshop evidence on Workspace/deep research is thinner.
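
The "challenge them against each other" practice in the Cross-model use row can be sketched in a few lines. This is a minimal illustration, not our production setup: the `challenge` function and its return shape are hypothetical, and each model is represented by a plain callable that takes a prompt and returns text, so you would plug in your own Gemini, Claude and ChatGPT clients.

```python
from collections import Counter
from typing import Callable


def challenge(prompt: str, models: dict[str, Callable[[str], str]]) -> dict:
    """Ask every model the same prompt and report whether they agree.

    `models` maps a label to any callable that returns the model's answer;
    the real API clients are deliberately left out of this sketch.
    """
    if not models:
        raise ValueError("at least one model is required")

    # Collect one answer per model, trimmed so whitespace does not cause false disagreement.
    answers = {name: ask(prompt).strip() for name, ask in models.items()}

    # Count identical answers and find the most common one.
    counts = Counter(answers.values())
    top_answer, votes = counts.most_common(1)[0]

    return {
        "answers": answers,
        "unanimous": len(counts) == 1,
        # Only report a majority when more than half of the models agree.
        "majority": top_answer if votes > len(answers) / 2 else None,
    }
```

Exact string matching only suits short, checkable answers such as a number from a calculation; for longer outputs a disagreement flag like this is a prompt to read the answers side by side, not a verdict.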

In Wouter's words

Gemini is actually very good at huge context limits.

What is so special about Gemini Nano Banana Pro is that it's a reasoning image model - it thinks before it creates, which means it follows your prompt much more precisely.

Watch-outs

Like all LLMs, it guesses on complex calculations - Wouter uses Gemini as the live example of a confidently-wrong answer on hard maths.
Big context is double-edged: don't dump everything in, or the model loses context awareness; truly massive inputs still hit a limit.
Image editing degrades after about four or five edit iterations on a slide.
Our evidence on Workspace integration is thinner than for Claude/ChatGPT - treat those as general practice, not hard Gemini-specific claims.
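
To make the "don't dump everything in" watch-out concrete, here is a rough pre-flight check you could run before full-context injection. It is a sketch under stated assumptions: the 4-characters-per-token ratio is a crude heuristic rather than Gemini's real tokenizer, the 1,000,000-token window is the approximate figure quoted above, and the 20% headroom and all function names are hypothetical choices for illustration.

```python
# Assumptions (not measured values): adjust these to your own setup.
CHARS_PER_TOKEN = 4          # crude heuristic; varies by language and content
CONTEXT_WINDOW = 1_000_000   # approximate window size quoted above
HEADROOM = 0.2               # hypothetical reserve for instructions and the answer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def select_documents(docs: dict[str, str]) -> tuple[list[str], list[str]]:
    """Keep documents in the given priority order while they fit the budget.

    Returns (kept, skipped) lists of document names. A document that does not
    fit is skipped, but smaller documents after it may still be kept.
    """
    budget = int(CONTEXT_WINDOW * (1 - HEADROOM))
    kept: list[str] = []
    skipped: list[str] = []
    used = 0
    for name, text in docs.items():
        cost = estimate_tokens(text)
        if used + cost <= budget:
            kept.append(name)
            used += cost
        else:
            skipped.append(name)
    return kept, skipped
```

Ordering the input by priority keeps the most relevant material in and makes explicit what was left out, which is the discipline the watch-out asks for. For a real budget, replace the heuristic with the provider's own token count.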

Our take

We treat Gemini as a genuine frontier model and a default member of our three-model toolkit - the one to reach for when context size matters, because it actually reads every page rather than skimming. We're especially keen on Nano Banana Pro as a reasoning image model for slide-grade visuals, and on the AI Studio API that lets us wire Gemini into agents and MCPs. The honest caveat: big context only helps when you're disciplined about what you feed it, and like any model it can confidently guess wrong on hard maths.

Just Gemini - I can tell you it's mind-blowing what's possible these days.

Wouter van Haaften, WAIMAKERS
