I run MacWhisper on my laptop and often dump podcast MP3s into it, extract the Whisper transcript, and then feed that through a long-context model like Claude 3 Haiku/Opus or Gemini Pro 1.5/Gemini Flash, using my https://llm.datasette.io/ tool to answer questions against that transcript.
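
As a rough illustration of the last step, here is a minimal Python sketch that pipes a saved transcript into the `llm` command-line tool and passes the question as the system prompt. It is not necessarily the exact invocation used: the helper names (`build_llm_command`, `ask_transcript`), the file path, and the default model ID `claude-3-haiku` are assumptions for illustration, and that model ID only works if the matching `llm` plugin and API key are set up.

```python
import subprocess
from pathlib import Path


def build_llm_command(model: str, question: str) -> list[str]:
    """Build the argv for the llm CLI.

    -m selects the model and -s sets the system prompt; the transcript
    itself is supplied on standard input rather than as an argument.
    """
    return ["llm", "-m", model, "-s", question]


def ask_transcript(transcript_path: str, question: str,
                   model: str = "claude-3-haiku") -> str:
    """Send a transcript file to a long-context model and return the answer.

    The default model ID is an assumption; use whatever `llm models`
    lists on your machine.
    """
    transcript = Path(transcript_path).read_text(encoding="utf-8")
    result = subprocess.run(
        build_llm_command(model, question),
        input=transcript,      # transcript goes in via stdin
        capture_output=True,
        text=True,
        check=True,            # raise if llm exits with an error
    )
    return result.stdout
```

Calling `ask_transcript("episode.txt", "What were the main points discussed?")` would then return the model's answer as a string, where `episode.txt` is a hypothetical transcript exported from MacWhisper.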