AI

Gemini

Google's multimodal model family.

Gemini's strength is multimodal input: long video, long audio, and tight integration with the Google stack (Drive, Workspace, NotebookLM). I use it in pipelines that need to ingest the messy real world.

  • Best-in-class video + audio understanding
  • NotebookLM for source-grounded research
  • Tight Workspace integration for backoffice work
Why

Multimodal input is where Gemini wins outright: long video, long audio, and deep Workspace integration. If a pipeline has to ingest the messy real world, this is the model that does it most cheaply.

How
  • Gemini for video/audio ingestion + Workspace ops
  • NotebookLM for source-grounded research bundles
  • Hand off generation to Claude or specialised image/video models
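
The division of labour in the list above can be sketched as a small router. This is a minimal illustration under stated assumptions, not the actual pipeline: the `Task` type, the `route` function, the task kinds, and the returned tool labels are all hypothetical names chosen for the example, and none of them correspond to a real Gemini, NotebookLM, or Claude API.

```python
from dataclasses import dataclass

# Hypothetical task kinds, assumed for illustration only.
INGEST_KINDS = {"video", "audio", "workspace"}
RESEARCH_KINDS = {"research"}
GENERATION_KINDS = {"generation"}


@dataclass(frozen=True)
class Task:
    kind: str     # e.g. "video", "audio", "workspace", "research", "generation"
    payload: str  # file path, URL, or prompt; opaque to the router


def route(task: Task) -> str:
    """Return a label for the tool that should handle the task.

    The labels are plain strings, not API identifiers.
    """
    if task.kind in INGEST_KINDS:
        return "gemini"
    if task.kind in RESEARCH_KINDS:
        return "notebooklm"
    if task.kind in GENERATION_KINDS:
        return "claude-or-specialised-model"
    raise ValueError(f"unknown task kind: {task.kind!r}")
```

Keeping the routing decision in one function means the ingestion, research, and generation stages can be swapped independently, which matches the hand-off described in the last bullet.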
Proof
  • Hours of audio ingested: 200+
  • NotebookLM workflows: PAD · LAD · VAD
  • Workspace integrations: Drive · Gmail · Docs