10 AI Projects to Learn New Models: Gemini 3, Nano Banana, and Opus 4.5

Overview

This episode of the AI Daily Brief provides a practical, project-based guide to exploring the wave of new AI models released in late 2025. The speaker (host of the AI Daily Brief, name not stated) argues that the recent model releases (the GPT-5.1 series, Gemini 3, Nano Banana 2, Opus 4.5, and Grok 4.1) have collectively unlocked capabilities that were previously impossible or impractical. Rather than pure news analysis, the episode offers 10 hands-on weekend projects to help listeners develop working familiarity with these models.

Source video: URL not provided (AI Daily Brief podcast/video, published around Thanksgiving weekend 2025)


Prerequisites

  • Basic familiarity with large language models (ChatGPT, Claude, Gemini)
  • Some exposure to AI image generation tools
  • General awareness of “vibe coding” platforms (Lovable, Replit, Google AI Studio)
  • No programming experience required for most projects; a few advanced suggestions target developers or technical users
  • Familiarity with NotebookLM is helpful but not required

Main Points

Bonus/Project 0 — Install Wispr Flow for Voice Dictation

  • Wispr Flow (wisprflow.ai) is a voice-to-text tool that the speaker says dramatically outperforms native macOS/iOS dictation
  • Works on desktop and mobile; activated via keyboard shortcut (e.g., Control + Option)
  • Speaker dictates at ~140 words per minute with automatic cleanup of rambling speech
  • Recommended as a universal productivity upgrade before undertaking any of the other projects
  • Integrating speech input changes which tasks feel worth doing and lowers the friction of giving AI rich context

Project 1 — Create an Infographic with Nano Banana

  • Nano Banana 2 (Google’s image generation model integrated with Gemini 3) can handle text-dense, information-rich infographics that, according to the speaker, no prior image model could produce
  • Key insight: because it is natively integrated with Gemini 3’s reasoning, multi-step tasks (e.g., summarising a podcast and then producing an infographic from the summary) can be completed without explicit step-by-step prompting
  • Suggested exercise: drop a work report, project summary, or proposal into Gemini 3 or NotebookLM and request an infographic
  • Speaker cautions that default outputs will quickly feel generic (“slop sense”); developing human taste and directional nudges will differentiate good infographics from average ones

Project 2 — Data Visualization with Gemini 3 + Nano Banana

  • Combining Gemini 3’s reasoning with Nano Banana’s visual output is described as “greater than the sum of the parts”
  • Suggested exercise: document intended weekly goals at the start of a week, then at week’s end give Gemini 3 access to your calendar (direct integration or screenshot) and ask it to visualise the gap between goals and actual time spent
  • Simpler variation (from Zara Zhang): upload a résumé and ask for a slide deck with competencies visualised as Venn diagrams, which is useful for understanding personal positioning
  • Both examples demonstrate that the real power is reasoning-informed visual output, not raw image generation alone
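
A rough sketch of the arithmetic behind the goals-versus-calendar exercise, in Python. The category names, planned hours, and calendar entries are made-up placeholders, and `actual_hours` and `goal_gap` are hypothetical helper names, not anything shown in the episode. In practice Gemini 3 would do this reasoning itself; the sketch only makes explicit what "the gap" means before it is visualised.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical planned hours per category for one week (illustrative values only).
planned_hours = {"deep work": 15.0, "meetings": 8.0, "exercise": 4.0}

# Hypothetical calendar entries: (category, start, end) as ISO 8601 strings.
calendar_events = [
    ("deep work", "2025-12-01T09:00", "2025-12-01T12:00"),
    ("meetings", "2025-12-01T13:00", "2025-12-01T17:30"),
    ("deep work", "2025-12-02T09:00", "2025-12-02T11:00"),
    ("meetings", "2025-12-03T10:00", "2025-12-03T16:00"),
    ("exercise", "2025-12-04T07:00", "2025-12-04T08:00"),
]


def actual_hours(events):
    """Sum the duration of calendar events per category, in hours."""
    totals = defaultdict(float)
    for category, start, end in events:
        delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        totals[category] += delta.total_seconds() / 3600
    return dict(totals)


def goal_gap(planned, events):
    """Return actual minus planned hours per category (negative = under target)."""
    actual = actual_hours(events)
    categories = sorted(set(planned) | set(actual))
    return {c: actual.get(c, 0.0) - planned.get(c, 0.0) for c in categories}


if __name__ == "__main__":
    for category, gap in goal_gap(planned_hours, calendar_events).items():
        print(f"{category:<10} {gap:+.1f} h")
```

With these placeholder values, deep work comes out 10 hours under target and meetings 2.5 hours over. A per-category table like this is the kind of intermediate result one could ask the model to turn into a chart.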

Project 3 — Image Editing with Nano Banana 2

  • Nano Banana 2 extends the precision editing capabilities introduced in version 1, enabling targeted changes (style, theme, specific elements) while preserving the overall composition
  • Suggested exercise: generate or supply an existing image and ask to change specific attributes (e.g., art style, seasonal theme) without regenerating from scratch
  • Example: changing a photorealistic Thanksgiving image to a cartoon style, or adding holiday theming to a pet adoption card, using minimal prompts
  • Speaker recommends mastering basic editing before attempting more advanced use cases

Project 4 — Explore New NotebookLM Features

  • NotebookLM’s Studio section has added video overviews, infographics (via Nano Banana), and slide decks
  • These features chain together: NotebookLM can analyse many uploaded sources, synthesise findings, and render them visually
  • Demonstrated example: 22 past “super intelligent audit” reports were loaded into a notebook; NotebookLM then extracted average readiness scores and produced a thematic infographic and a slide deck (“AI Readiness Playbook”)
  • Suggested exercise for new users: pick a topic you know well, load sources from the web, generate all three output types (infographic, slide deck, video overview), and evaluate quality against your own knowledge

Project 5 — Strategic Planning and Reasoning with GPT-5.1

  • Speaker describes GPT-5.1 as their favourite model for strategic thinking and “thinking out loud”
  • GPT-5.1 offers four thinking modes: Light, Standard, Extended, Heavy (under Auto), plus a Pro mode for long-context synthesis
  • Recommended workflow:
    • Use Standard for initial planning and low-stakes exploration
    • Escalate to Extended or Heavy when facing hard prioritisation decisions
    • Switch to Pro mode to synthesise a long conversation into an actionable plan and shareable memos
  • Suggested exercise: use the approach of the new year to do annual or quarterly planning. Give the model rich context (documents, analytics, or a 10-minute voice ramble via Wispr Flow), state your goals (including unresolved ones), force the model to make decisions with reasoning where needed, then produce an executable plan in Pro mode
  • Speaker notes GPT-5.1 is noticeably less evasive about making definitive choices than previous ChatGPT models
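
The recommended workflow above can be written down as a small decision rule. This is only a sketch: the `Stage` enum, the `pick_mode` function, and the `high_stakes` flag are hypothetical names, and splitting Extended from Heavy by stakes is an assumption, since the episode groups the two together for hard prioritisation decisions.

```python
from enum import Enum


class Stage(Enum):
    """Phases of a planning session (hypothetical labels for this sketch)."""
    EXPLORE = "explore"        # initial planning, low-stakes exploration
    DECIDE = "decide"          # hard prioritisation decisions
    SYNTHESISE = "synthesise"  # turning a long conversation into a plan


def pick_mode(stage: Stage, high_stakes: bool = False) -> str:
    """Map a planning stage to one of the thinking modes named in the notes."""
    if stage is Stage.SYNTHESISE:
        return "Pro"
    if stage is Stage.DECIDE:
        # Assumption: reserve Heavy for the highest-stakes decisions.
        return "Heavy" if high_stakes else "Extended"
    return "Standard"


if __name__ == "__main__":
    for stage in Stage:
        print(stage.value, "->", pick_mode(stage))
```

The point of the rule is to start cheap and escalate only when a decision is genuinely hard, then switch to Pro once the conversation is long enough to need synthesis.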

Project 6 — Vibe Code a Personal Accountability Web App

  • All major vibe coding platforms (e.g., Lovable, Replit) now have access to Gemini 3, Opus 4.5, and related models
  • Suggested exercise: turn the strategic plan from Project 5 into a published web app, namely a personal accountability tool with a visual timeline, to-do structure, file upload capability, and recurring check-in notifications (email or push)
  • The goal is an end-to-end experience, from idea to published, interactive website, without touching GitHub
  • If privacy is a concern, vibe code password protection into the app
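
For readers who want to see what such an app might track under the hood, here is a minimal data-model sketch in Python. Everything in it is an assumption made for illustration: the `Task` and `Goal` classes, the `check_in_dates` helper, and the weekly default interval are hypothetical and were not specified in the episode. A vibe coding platform would generate its own equivalent.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List


@dataclass
class Task:
    """One to-do item on the timeline."""
    title: str
    due: date
    done: bool = False


@dataclass
class Goal:
    """A goal from the strategic plan, broken into tasks."""
    name: str
    tasks: List[Task] = field(default_factory=list)

    def progress(self) -> float:
        """Fraction of tasks completed, or 0.0 when there are no tasks."""
        if not self.tasks:
            return 0.0
        return sum(t.done for t in self.tasks) / len(self.tasks)


def check_in_dates(start: date, count: int, interval_days: int = 7) -> List[date]:
    """Dates on which a recurring check-in notification would be sent."""
    return [start + timedelta(days=interval_days * i) for i in range(count)]


if __name__ == "__main__":
    goal = Goal(
        "Launch newsletter",
        [
            Task("Draft issue 1", date(2026, 1, 9), done=True),
            Task("Set up signup page", date(2026, 1, 16)),
        ],
    )
    print(f"{goal.name}: {goal.progress():.0%} complete")
    for d in check_in_dates(date(2026, 1, 5), count=3):
        print("Check-in on", d.isoformat())
```

Describing the app to the platform in these terms (goals, tasks with due dates, a check-in schedule) tends to give a clearer prompt than asking for "an accountability app" in general.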

Project 7 — Integrate Gen AI Features via Google AI Studio

  • Google AI Studio has been upgraded to simplify vibe coding of Gen AI apps, making Gemini API features (image generation, chatbots, image animation) directly accessible within a vibe coding workflow
  • Suggested extension of Project 6: add a Nano Banana-generated infographic that auto-renders weekly progress, or add a conversational voice agent that interviews the user each week
  • Speaker argues that conversational voice check-ins are meaningfully more powerful than passive email notifications for extracting context and refining plans
  • Non-technical vibe coders are described as “barely scratching the surface” of what is now accessible

Project 8 — Explore Replit’s Design Mode

  • Replit has released a design mode focused on visual prototyping, powered by new models
  • Produces markedly better visual output than the “sloppy purple interfaces and standard templates” previously associated with vibe-coded apps
  • Suggested exercise: use design mode on any vibe-coded project to experience the difference in visual quality
  • Available to try on the free plan

Project 9 (Bonus) — Use Claude Opus 4.5 for Coding and Complex Tasks

  • Opus 4.5 is described by early users as the best coding model available at time of recording
  • Key improvement: stays focused during deep, long coding tasks without losing context (“not getting lost in the sauce”)
  • More advanced users can replicate the web app projects using Claude Code + Opus 4.5 directly, bypassing platforms like Replit
  • Also claimed to be meaningfully better at complex spreadsheet work (e.g., Excel)
  • Claude-suggested vibe coding idea: a content repurposing hub—input a podcast transcript or video script and generate social posts, LinkedIn article, newsletter, tweet thread, and pull quotes for graphics
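
The content repurposing hub boils down to running one source text through several format-specific prompts. A minimal Python sketch of that fan-out follows. The prompt wording, the item counts, and the names `OUTPUT_FORMATS`, `build_prompts`, and `repurpose` are all hypothetical. The `generate` argument is a stand-in for whatever model call the app would actually use, so no real API is assumed here.

```python
from typing import Callable, Dict

# Hypothetical instruction per output format named in the notes.
OUTPUT_FORMATS: Dict[str, str] = {
    "social_posts": "Write three short social media posts",
    "linkedin_article": "Write a LinkedIn article",
    "newsletter": "Write a newsletter issue",
    "tweet_thread": "Write a tweet thread",
    "pull_quotes": "Extract five pull quotes suitable for graphics",
}


def build_prompts(source_text: str) -> Dict[str, str]:
    """Build one prompt per output format from a transcript or script."""
    return {
        name: f"{instruction} based on the following source.\n\n{source_text}"
        for name, instruction in OUTPUT_FORMATS.items()
    }


def repurpose(source_text: str, generate: Callable[[str], str]) -> Dict[str, str]:
    """Send each prompt through `generate` (a placeholder for a model call)."""
    return {name: generate(prompt) for name, prompt in build_prompts(source_text).items()}


if __name__ == "__main__":
    # Stub generator so the sketch runs without any model access.
    def fake_generate(prompt: str) -> str:
        return f"[draft from a {len(prompt)}-character prompt]"

    for name, draft in repurpose("Example transcript text.", fake_generate).items():
        print(name, "->", draft)
```

Passing the model call in as a function keeps the fan-out logic independent of any one provider, which matters when the same hub might be rebuilt on Claude Code, Replit, or Google AI Studio.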

Key Concepts

  • Nano Banana 2: Google’s image generation model (second version), integrated natively with Gemini 3; supports photorealistic generation, precise editing, and text-dense infographics
  • Gemini 3: Google’s flagship multimodal reasoning model; its native integration with Nano Banana enables multi-step reasoning + visual output without explicit chaining
  • GPT-5.1 / 5.1 Pro: OpenAI’s latest reasoning model family; features graduated thinking depth (Light, Standard, Extended, Heavy) and a Pro mode optimised for long-context synthesis
  • Opus 4.5: Anthropic’s latest Claude model; described as leading for complex coding tasks and deep reasoning without context degradation
  • Grok 4.1: xAI’s latest model; mentioned as an alternative reasoning model for strategic planning tasks
  • NotebookLM: Google’s research assistant product; recently added Studio features including infographic, slide deck, and video overview generation using Nano Banana and Gemini 3
  • Vibe Coding: The practice of building functional software applications using natural language prompts to AI coding platforms, with no or minimal traditional programming
  • Wispr Flow: Third-party voice-to-text application (wisprflow.ai) that the speaker describes as significantly more accurate than native OS dictation tools
  • Google AI Studio: Google’s developer platform for building AI-powered applications; recently upgraded to simplify integration of Gemini API features (image gen, voice agents, chatbots) within vibe coding workflows
  • Replit Design Mode: A recently released feature in Replit focused on producing high-quality visual UI prototypes, powered by new generation models
  • Native Multimodality: The property of a model that can reason across text, images, audio, and other modalities as part of a unified process rather than through separate pipeline steps
  • Content Repurposing Hub: A suggested vibe-coded application concept that transforms long-form content (podcast, video script) into multiple short-form distribution formats automatically

Summary

The speaker’s central message is that the cluster of major model releases in November 2025 (GPT-5.1, Gemini 3, Nano Banana 2, Opus 4.5, and others) has collectively expanded what is practically achievable with AI, and that the best way to internalise these new capabilities is hands-on experimentation rather than passive observation. The 10 projects range from accessible entry points (installing Wispr Flow, editing a single image, generating an infographic) to more involved workflows (vibe coding a published web app with an embedded voice agent, using GPT-5.1 Pro for annual strategic planning). A consistent theme is that the deepest value of these models lies not in any single feature but in the integration of reasoning with multimodal output, particularly the Gemini 3 / Nano Banana pairing. Developing personal taste and judgment in directing these tools is what will separate generic outputs from genuinely high-quality work.