YouTube Audio Summarizer

Multi-platform video summarizer using Whisper + Gemini. Supports YouTube, Bilibili, and more.

Tweaker · claude-code · python · whisper

What I Built

A CLI tool that takes any YouTube or Bilibili video URL, downloads the audio, transcribes it with Whisper, and generates a structured summary using Gemini. Useful for processing long podcasts, lectures, or conference talks.

Tools Used

  • Claude Code -- Built the pipeline end-to-end
  • Python -- Core language
  • Whisper -- Audio transcription (local, free)
  • Gemini -- Summary generation
  • yt-dlp -- Multi-platform video downloading

How I Vibe Coded It

I told Claude Code: "build me a tool that summarizes YouTube videos by downloading the audio, transcribing it, and summarizing the transcript." It set up the full pipeline in one go: yt-dlp for downloading, Whisper for transcription, and Gemini for summarization.
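
A minimal sketch of how the three stages might chain together. The names here (`summarize_video`, `download`, `transcribe`, `summarize`) are hypothetical and not taken from the actual project. Each stage is passed in as a plain callable, so the flow can be read and dry-run without yt-dlp, Whisper, or a Gemini API key.

```python
from typing import Callable


def summarize_video(
    url: str,
    download: Callable[[str], str],
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Run the download -> transcribe -> summarize chain for one video URL."""
    audio_path = download(url)           # e.g. a yt-dlp wrapper returning a file path
    transcript = transcribe(audio_path)  # e.g. a Whisper wrapper returning plain text
    return summarize(transcript)         # e.g. a Gemini wrapper returning the summary


if __name__ == "__main__":
    # Stand-in stages so the sketch runs offline; swap in real wrappers later.
    result = summarize_video(
        "https://example.com/watch?v=demo",
        download=lambda url: "demo.m4a",
        transcribe=lambda path: f"transcript of {path}",
        summarize=lambda text: f"summary of [{text}]",
    )
    print(result)
```

Keeping the stages as separate callables also means any one of them can be replaced (a different transcriber, a different summarizer) without touching the other two.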

The multi-platform support (Bilibili, etc.) came from yt-dlp supporting hundreds of sites out of the box. I didn't have to write any platform-specific code.
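
The download step might look roughly like this. `audio_options` and `download_audio` are hypothetical names, and the option values are assumptions rather than the project's actual settings; `YoutubeDL`, `extract_info`, and `prepare_filename` come from yt-dlp's Python API. Nothing in the code branches on the site, which is why a Bilibili URL goes through the same path as a YouTube one.

```python
import os


def audio_options(out_dir: str = "downloads") -> dict:
    """Build yt-dlp options that fetch the best available audio stream."""
    return {
        "format": "bestaudio/best",  # prefer audio-only, fall back to best combined
        "outtmpl": os.path.join(out_dir, "%(id)s.%(ext)s"),  # name files by video id
        "noplaylist": True,  # a single video even if the URL points into a playlist
        "quiet": True,
    }


def download_audio(url: str, out_dir: str = "downloads") -> str:
    """Download the audio for `url` and return the local file path."""
    import yt_dlp  # imported lazily; requires `pip install yt-dlp`

    with yt_dlp.YoutubeDL(audio_options(out_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)
```

The returned path can be handed straight to the transcription step, assuming ffmpeg is installed so Whisper can decode whatever container the site serves.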

Key Lessons

  • Chain existing tools, don't reinvent. yt-dlp + Whisper + Gemini = a powerful pipeline with zero custom ML code.
  • CLI tools are underrated first projects. No frontend to design, no deployment to worry about. Just run it and get results.
  • Good for entry-level vibe coders. This is a great first "real" project because the output is immediately useful.
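
To illustrate how little scaffolding a CLI needs, here is a sketch of a command-line entry point built on `argparse` from the standard library. The program name `summarize`, the `--model` flag, and its default are hypothetical and not taken from the project; the listed choices are Whisper's standard model sizes.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Define the command-line interface: one URL plus an optional model size."""
    parser = argparse.ArgumentParser(
        prog="summarize",
        description="Summarize a video from its URL.",
    )
    parser.add_argument("url", help="video URL (YouTube, Bilibili, ...)")
    parser.add_argument(
        "--model",
        default="base",
        choices=["tiny", "base", "small", "medium", "large"],
        help="Whisper model size (assumed flag, not from the project)",
    )
    return parser


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)
    # The real pipeline call would go here; this sketch only echoes the inputs.
    print(f"Would summarize {args.url} with Whisper model '{args.model}'")
```

Calling `main(["https://example.com/watch?v=demo", "--model", "small"])` exercises the parser directly; in a script, `main()` would be invoked under an `if __name__ == "__main__":` guard so it reads the arguments from the shell.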