Khabar: Building a Daily AI Radio Show on a Mac Mini

#ai #automation #local-first #tts #workflow

Every morning at 8 AM, something happens in my home office that feels a little magical. My Mac Mini wakes up, scans my email for AI newsletters, researches the week’s most interesting stories, writes a Wikipedia-style wiki with proper citations, assembles a conversational radio script, and reads it aloud — locally, privately, without touching a single API for the audio.

This is Khabar (Hindi for “news”), and it took a few days of evenings and a few more bugs than I’d like to admit.

The Problem: Newsletters Are a Fire Hose

I subscribe to a lot of AI newsletters — AINews, TLDR, AlphaSignal, and a few others. They’re great, but there’s a constant tension: reading them takes time I don’t have, and skimming means missing nuance.

I wanted something different. A morning radio briefing. 15-20 minutes of a knowledgeable friend talking me through the week’s most interesting developments — not a summary, but a narrative. With context. With interesting tangents. With citations I could follow up on later.

The Constraint: Budget Hardware, Privacy-First

I’m running this on an entry-level Mac Mini. No GPU. No budget for OpenAI API calls every morning. The full pipeline has to run locally or on already-configured services I pay for anyway.

This shaped every decision:

  • Research/brain: Delegated to AI agents with good APIs (MiniMax, ZAI)
  • Voice/TTS: Had to be local — Piper TTS with a small ONNX model
  • Storage: Zettelkasten-style wiki in Obsidian, synced to Google Drive, audio in a local folder
  • Cost: Zero marginal cost per run

The Architecture

The pipeline flows in five stages, triggered by a single cron job at 8 AM IST:

  1. Collect — Sync Gmail via dak, filter newsletters from the last 48 hours
  2. Research — Delegate URL extraction and 100-word summaries to parallel sub-agents
  3. Wiki — Consolidate into a Zettelkasten-style LLM wiki with backlinks and citations
  4. Script — Generate a conversational radio script with host markers, transitions, and pauses
  5. Audio — Parse the script, generate TTS via Piper, stitch with ffmpeg, deliver
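
The five stages can be pictured as a chain of plain functions called from one entry point. This is a minimal sketch rather than Khabar's actual code: every function name, the RunState structure, and the stub bodies are hypothetical, and they only show how a single cron-triggered run could hand data from stage to stage.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunState:
    """Data handed from stage to stage (hypothetical structure)."""
    emails: list[str] = field(default_factory=list)
    summaries: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)
    script: str = ""
    audio_path: str = ""
    completed: list[str] = field(default_factory=list)


# Stub stages: real versions would call dak, the sub-agents, Piper and ffmpeg.
def collect(state: RunState) -> RunState:
    state.emails = ["newsletter A", "newsletter B"]
    return state


def research(state: RunState) -> RunState:
    state.summaries = [f"summary of {e}" for e in state.emails]
    return state


def build_wiki(state: RunState) -> RunState:
    state.notes = [f"note: {s}" for s in state.summaries]
    return state


def write_script(state: RunState) -> RunState:
    state.script = "\n".join(f"HOST: {s}" for s in state.summaries)
    return state


def render_audio(state: RunState) -> RunState:
    state.audio_path = "/tmp/khabar-demo.mp3"  # placeholder; nothing is written
    return state


STAGES: list[Callable[[RunState], RunState]] = [
    collect, research, build_wiki, write_script, render_audio,
]


def run_pipeline() -> RunState:
    """Run every stage in order; an exception in any stage stops the run."""
    state = RunState()
    for stage in STAGES:
        state = stage(state)
        state.completed.append(stage.__name__)
    return state


if __name__ == "__main__":
    print(run_pipeline().completed)
```

With this shape, one crontab entry using the schedule `0 8 * * *` (08:00 in the machine's local time zone) is enough to drive the whole run, and the wiki and audio stages never race each other.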

A Word on dak

The pipeline starts with email. I use dak, a local-first email metadata bridge I built. The name is Hindi: dak (डाक) means mail. It’s a thin SQLite layer over the gog CLI, which talks to Gmail via OAuth. Newsletters arrive, get synced locally, and the pipeline picks them up from there. No browser, no web UI, no giving another service access to my inbox.
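
To make the "last 48 hours" filter concrete, here is a small sketch of what a query against a local SQLite store could look like. The table name, column names, and sender addresses are all assumptions made for illustration; dak's real schema is not shown in this post.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Placeholder sender allowlist; real newsletter addresses will differ.
NEWSLETTER_SENDERS = ("news@ainews.example.test", "digest@tldr.example.test")


def recent_newsletters(conn: sqlite3.Connection, now: datetime, hours: int = 48) -> list[str]:
    """Return subjects of allowlisted newsletters received within the window.

    Assumes received_at holds UTC ISO-8601 strings in one consistent format,
    so plain string comparison orders them chronologically.
    """
    cutoff = (now - timedelta(hours=hours)).isoformat()
    marks = ",".join("?" for _ in NEWSLETTER_SENDERS)
    rows = conn.execute(
        f"SELECT subject FROM messages "
        f"WHERE sender IN ({marks}) AND received_at >= ? "
        f"ORDER BY received_at",
        (*NEWSLETTER_SENDERS, cutoff),
    ).fetchall()
    return [row[0] for row in rows]


def demo() -> list[str]:
    """Build a throwaway in-memory database and run the filter against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (sender TEXT, subject TEXT, received_at TEXT)")
    now = datetime(2026, 4, 20, 2, 30, tzinfo=timezone.utc)  # 8:00 AM IST
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?)",
        [
            (NEWSLETTER_SENDERS[0], "fresh issue", (now - timedelta(hours=5)).isoformat()),
            (NEWSLETTER_SENDERS[1], "stale issue", (now - timedelta(hours=72)).isoformat()),
            ("friend@example.test", "lunch?", (now - timedelta(hours=1)).isoformat()),
        ],
    )
    return recent_newsletters(conn, now)
```

In the demo, only the allowlisted message inside the 48-hour window survives; the stale issue and the personal email are filtered out.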

The Wiki: Your Own Private Wikipedia

The research doesn’t just feed the radio show — it builds a persistent knowledge base. Every article gets fetched, summarized to 100 words, and stored as an atomic Zettelkasten note. Each note captures the source newsletter, the original link, and a cleaned-up summary.

The wiki lives in Obsidian and syncs to Google Drive, so I can browse it from any machine. A company like Anthropic accumulates a profile page. A topic like AI agent governance gets its own note with key takeaways and related links. The daily timeline is the index — linking back to every topic and company mentioned that day.
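
One way to keep notes atomic and uniform is to render them from a small data structure. The sketch below is an assumption about how that could be done, not the pipeline's actual code: TopicNote, cap_words, and render_note are hypothetical names, and the 100-word cap is enforced by simple truncation.

```python
from dataclasses import dataclass


@dataclass
class TopicNote:
    title: str
    date: str
    sources: dict[str, str]  # label -> URL
    summary: str
    takeaways: list[str]
    related: list[str]  # slugs of other notes, rendered as [[wikilinks]]


def cap_words(text: str, limit: int = 100) -> str:
    """Keep at most `limit` whitespace-separated words."""
    return " ".join(text.split()[:limit])


def render_note(note: TopicNote) -> str:
    """Render a note as Markdown with Obsidian-style wikilinks."""
    sources = " | ".join(f"[{label}]({url})" for label, url in note.sources.items())
    lines = [
        f"# {note.title}",
        "",
        f"**Date:** {note.date}",
        f"**Sources:** {sources}",
        "",
        "## Summary",
        cap_words(note.summary),
        "",
        "## Key Takeaways",
        *[f"- {item}" for item in note.takeaways],
        "",
        "## Related Topics",
        *[f"- [[{slug}]]" for slug in note.related],
    ]
    return "\n".join(lines) + "\n"
```

Rendering from structured fields, rather than letting a model write free-form Markdown, keeps every note in the same shape, which makes later linking and searching more predictable.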

Here’s what a topic note looks like:

# AI Agent Governance & Orchestration

**Date:** 2026-04-17
**Sources:** [Turing Post](https://www.turingpost.com/p/galileo5) | [AgentMail](https://console.agentmail.to/sign-up)

## Summary
Multi-agent AI governance is emerging as a critical challenge.
Galileo and CrewAI are addressing production governance — enforcing
safety policies, steering agents to optimal models, controlling costs,
and ensuring compliance.

## Key Takeaways
- Execution-bound admissibility: re-evaluating every action in real time
- 5-point control layer: who, delegation, action, context, allow/block
- Runtime interception vs. commit authority distinction
- Galileo released 165-page "Mastering Multi-Agent Systems" guide

## Related Topics
- [[gemma-4]] — orchestration-focused model design
- [[ai-agent-orchestration]] — broader orchestration patterns

And the daily timeline anchors everything:

# Timeline - 2026-04-20 (Sunday)

## Overview
Heavy news week dominated by Claude Opus 4.7's release, open-source
model momentum, and the maturing AI agent ecosystem.

## Major Stories
### 1. Claude Opus 4.7 Dominates (Apr 17)
Anthropic released Opus 4.7 with improved vision, new tokenizer, and
xhigh effort tier. Pricing controversy as top tiers become significantly
more expensive.
- 📝 [[claude-opus-47]]

### 2. Gemma 4: Orchestration Over Parameters (Apr 19)
Google released Gemma 4 under Apache 2.0, positioning orchestration
as the new competitive moat for AI applications.
- 📝 [[gemma-4]]

## Sources
- AINews, TLDR, AlphaSignal, The Neuron, Turing Post, Everyday AI, AgentMail

The wiki is the secondary output — and arguably the more durable one. The radio show plays once. The wiki compounds.
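
Part of why the wiki compounds is that the [[wikilinks]] form a graph that can be inverted into backlinks. The following is a rough sketch of that idea under assumed note names; Obsidian computes backlinks on its own, so this only illustrates what the link structure makes possible.

```python
import re
from collections import defaultdict

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def extract_links(text: str) -> list[str]:
    """Return the unique link targets found in a note body, sorted."""
    return sorted({target.strip() for target in WIKILINK.findall(text)})


def build_backlinks(notes: dict[str, str]) -> dict[str, list[str]]:
    """Map each link target to the names of the notes that mention it."""
    backlinks: dict[str, list[str]] = defaultdict(list)
    for name, body in sorted(notes.items()):
        for target in extract_links(body):
            backlinks[target].append(name)
    return dict(backlinks)


if __name__ == "__main__":
    sample = {
        "timeline-2026-04-20": "Stories: [[claude-opus-47]] and [[gemma-4]]",
        "ai-agent-governance": "Related: [[gemma-4]], [[ai-agent-orchestration]]",
    }
    for target, sources in build_backlinks(sample).items():
        print(target, "<-", sources)
```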

The Seven Bugs

Building this revealed seven distinct failure modes:

1. Email truncation. gog gmail read truncates bodies above a certain length. Solution: pipe the output to files, then extract the full text from there. Some newsletters also wrap links in Beehiiv/Substack tracking redirects; for those, the pipeline falls back to a direct web search for the article URL.

2. The chipmunk voice (first time). Piper TTS outputs audio at 22050 Hz. The script concatenated it with 44100 Hz silence segments using ffmpeg’s concat with -c copy. Stream copy does not resample, so the 22050 Hz speech ended up being played back as 44100 Hz audio: twice as fast and an octave higher, which made the host sound like an anxious chipmunk. Fix: apply an explicit aresample=44100 filter to every segment before concatenation.

3. The chipmunk voice (second time). I tried to slow the voice down by setting speed < 1.0 in Piper. Turns out this also distorts pitch — same chipmunk effect, different cause. Lesson: run Piper at 1.0x always; if you need slower speech, use ffmpeg atempo after generation.

4. Storage sprawl. The pipeline was leaving behind raw WAV files, intermediate JSON, and URL lists. A 10-minute show was generating 200 MB of artifacts. Fix: write everything to /tmp/ during processing and persist only the final MP3 and the wiki entries. I also added auto-cleanup that keeps only the last 7 shows.

5. Duration too short. First few shows were 3-4 minutes. Turns out concise newsletter summaries don’t fill a radio show. The fix wasn’t to pad — it was to generate more expansive, commentary-heavy scripts with natural transitions between stories rather than bullet-point narration.

6. XTTS v2 voice cloning rejected. XTTS would have given a more natural voice with voice cloning. But it requires accepting the Coqui Public Model License (CPML) at an interactive prompt, which I couldn’t automate in a cron context. Piper was the right call: lightweight, fast, and surprisingly warm with the en_US-lessac-medium voice model.

7. Cron delivery fragmentation. Initially had two separate cron jobs — one for the wiki, one for the radio show. They stepped on each other and duplicated work. Consolidated into one unified job that handles the full pipeline.
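
The two chipmunk bugs (items 2 and 3 above) come down to simple arithmetic plus two ffmpeg filters. The sketch below shows the arithmetic and builds the ffmpeg argument lists without running them. The filter names aresample and atempo come from the fixes described above; the helper names and file names are hypothetical, and the commands are illustrative rather than the pipeline's exact invocations.

```python
def speedup_factor(actual_hz: int, assumed_hz: int) -> float:
    """How much faster audio plays when its samples are read at the wrong rate."""
    return assumed_hz / actual_hz


def normalize_cmd(src: str, dst: str, rate: int = 44100, tempo: float = 1.0) -> list[str]:
    """Build an ffmpeg command that resamples a segment and optionally slows it.

    atempo changes speed while preserving pitch; this sketch restricts it to
    the 0.5-2.0 range, which a single atempo filter instance accepts.
    """
    filters = [f"aresample={rate}"]
    if tempo != 1.0:
        if not 0.5 <= tempo <= 2.0:
            raise ValueError("tempo must be between 0.5 and 2.0")
        filters.append(f"atempo={tempo}")
    return ["ffmpeg", "-y", "-i", src, "-af", ",".join(filters), "-ac", "1", dst]


def concat_cmd(list_file: str, dst: str) -> list[str]:
    """Build a concat-demuxer command; safe with -c copy only after normalizing."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", list_file, "-c", "copy", dst]


if __name__ == "__main__":
    print(speedup_factor(22050, 44100))  # 22050 Hz speech read as 44100 Hz
    print(" ".join(normalize_cmd("speech_001.wav", "speech_001_44k.wav", tempo=0.9)))
    print(" ".join(concat_cmd("segments.txt", "show.wav")))
```

The design point is that every segment, speech and silence alike, passes through the same normalization step first, so stream-copy concatenation only ever sees segments with identical parameters.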

What I Learned

Local-first audio is viable. Piper TTS on a CPU-only Mac Mini synthesizes a 15-minute show in under 2 minutes. The en_US-lessac-medium voice has a warm, slightly radio-host quality that works well for long-form narration.

Delegation isn’t free. Sending 10+ URLs to sub-agents for parallel research is powerful, but you need good prompts and explicit output formatting. Vague delegation leads to messy data that requires cleanup.
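
"Explicit output formatting" can be as simple as asking each sub-agent for a fixed JSON shape and rejecting anything else. This is a sketch under assumptions: the prompt wording, the key names, and the helper functions are hypothetical, and no real agent API is called.

```python
import json

REQUIRED_KEYS = {"url", "title", "summary"}
MAX_WORDS = 100


def build_prompt(url: str) -> str:
    """Compose a delegation prompt that spells out the expected reply format."""
    return (
        f"Fetch the article at {url} and summarize it.\n"
        f"Reply with a single JSON object and nothing else, using exactly "
        f"these keys: url, title, summary.\n"
        f"The summary must be at most {MAX_WORDS} words of plain text."
    )


def parse_reply(reply: str) -> dict:
    """Validate a sub-agent reply; raise ValueError on anything malformed."""
    data = json.loads(reply)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if len(str(data["summary"]).split()) > MAX_WORDS:
        raise ValueError("summary exceeds the word limit")
    return data
```

Rejecting malformed replies at the boundary means the cleanup happens in one place, instead of leaking messy data into the wiki and the script.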

Hardware constraints are a design tool, not an obstacle. Every limitation — no GPU, zero API budget for TTS, limited storage — forced a cleaner architecture. The hybrid approach (smart models for research, local for audio) is probably better than if I’d had unlimited budget.

Pacing matters more than content. The longest pauses in the show are between sections. That silence — 1.5 to 2 seconds — is what makes it feel like a show rather than a text-to-speech reading.
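
To show how pauses can be carried from the script into the audio stage, here is a small parser sketch. The marker syntax is an assumption made for illustration: lines starting with HOST: are speech and lines like [PAUSE 1.5] are silence. The actual marker format Khabar uses is not shown in this post.

```python
import re
from dataclasses import dataclass

PAUSE = re.compile(r"^\[PAUSE (\d+(?:\.\d+)?)\]$")


@dataclass
class Segment:
    kind: str  # "speech" or "silence"
    text: str = ""
    seconds: float = 0.0


def parse_script(script: str) -> list[Segment]:
    """Split a script into speech segments (for TTS) and silence segments."""
    segments: list[Segment] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = PAUSE.match(line)
        if match:
            segments.append(Segment("silence", seconds=float(match.group(1))))
        else:
            text = line[len("HOST:"):].strip() if line.startswith("HOST:") else line
            segments.append(Segment("speech", text=text))
    return segments


def total_silence(segments: list[Segment]) -> float:
    """Sum the seconds of silence across all pause segments."""
    return sum(s.seconds for s in segments if s.kind == "silence")
```

Each speech segment would go to the TTS engine and each silence segment would become a generated gap of the given length, so pacing is decided in the script rather than hard-coded in the audio step.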

The wiki is the real product. The radio show is the delivery mechanism. The knowledge base is what’s left behind. By the time this runs for a year, I’ll have a searchable, linkable record of everything significant in AI — sourced, summarized, and mine.

The Current State

Khabar runs every morning at 8 AM. Shows average 8-12 minutes depending on the news cycle. The wiki accumulates as a searchable long-term knowledge base, synced to Google Drive. The audio goes to a dedicated folder, where only the last 7 shows are retained.
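
The retention rule is easy to sketch. The version below assumes date-stamped file names such as khabar-2026-04-20.mp3, so that sorting by name also sorts by date; the naming scheme and the function names are hypothetical.

```python
import tempfile
from pathlib import Path


def prune_shows(folder: Path, keep: int = 7) -> list[Path]:
    """Delete all but the newest `keep` MP3s; return the paths that were removed.

    Relies on date-stamped names, so reverse name order is newest first.
    """
    shows = sorted(folder.glob("*.mp3"), key=lambda p: p.name, reverse=True)
    removed = shows[keep:]
    for path in removed:
        path.unlink()
    return removed


def demo() -> list[str]:
    """Create ten fake shows in a temp folder, prune, and list what remains."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        for day in range(1, 11):
            (folder / f"khabar-2026-04-{day:02d}.mp3").touch()
        prune_shows(folder)
        return sorted(p.name for p in folder.glob("*.mp3"))
```

Running the prune as the last step of each run keeps the folder bounded without needing a separate cleanup job.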

The goal of 15-20 minutes is still aspirational — it depends on the richness of that week’s news, not on padding the script. And that’s fine.

The whole thing adds $0 in API costs on top of the services I already pay for. The Mac Mini idles at 3W. And every morning, I wake up to a radio show about AI, tailored to my interests, built entirely on my own hardware.

That’s a good morning.