AI
Gemini
Google's multimodal model family.
Gemini's strength is multimodal input: long video, long audio, and tight integration with the Google stack (Drive, Workspace, NotebookLM). I use it in pipelines that need to ingest the messy real world.
- Best-in-class video + audio understanding
- NotebookLM for source-grounded research
- Tight Workspace integration for back-office work
Why
Multimodal input is where Gemini wins outright: long video, long audio, and deep Workspace integration. If the pipeline has to ingest the messy real world, this is the model that does it most cheaply.
How
- Gemini for video/audio ingestion + Workspace ops
- NotebookLM for source-grounded research bundles
- Hand off generation to Claude or specialised image/video models
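The split above can be sketched as a small routing table. This is a minimal illustration, not the actual pipeline: the task names, the `ROUTES` mapping, and the string labels are all hypothetical placeholders, and no real Gemini, NotebookLM, or Claude API is called.

```python
from enum import Enum


class Task(Enum):
    """Hypothetical task categories, one per kind of work in the list above."""
    INGEST_MEDIA = "ingest_media"      # long video / audio understanding
    WORKSPACE_OP = "workspace_op"      # Drive, Gmail, Docs operations
    RESEARCH = "research"              # source-grounded research bundles
    GENERATE_TEXT = "generate_text"    # drafting and writing
    GENERATE_MEDIA = "generate_media"  # image / video generation


# Assumed routing table. The values are plain labels for illustration,
# not real model identifiers or API endpoints.
ROUTES = {
    Task.INGEST_MEDIA: "gemini",
    Task.WORKSPACE_OP: "gemini",
    Task.RESEARCH: "notebooklm",
    Task.GENERATE_TEXT: "claude",
    Task.GENERATE_MEDIA: "specialised-media-model",
}


def route(task: Task) -> str:
    """Return the label of the tool that should handle the given task."""
    return ROUTES[task]


def plan(tasks: list[Task]) -> list[tuple[str, str]]:
    """Pair each task with its assigned tool, preserving the input order."""
    return [(task.value, route(task)) for task in tasks]
```

Calling `plan([Task.INGEST_MEDIA, Task.GENERATE_TEXT])` pairs the ingestion step with the `gemini` label and the writing step with the `claude` label, which is the hand-off described in the last bullet.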
Proof
- Hours of audio ingested: 200+
- NotebookLM workflows: PAD · LAD · VAD
- Workspace integrations: Drive · Gmail · Docs