Listen2RE: How I Built an AI Audio Learning Platform for 35,000+ Learners

This is a full product case study of Listen2RE, the AI-augmented audio learning platform for UPSC/MPSC aspirants that I designed and launched as CEO and AI Product Lead at Zerton Education Technologies. It covers the problem, the architecture, the metrics, and the lessons.

35,000+ learners. 60% reduction in manual content effort. Built from 0 to production in under a year.

The problem

UPSC and MPSC aspirants in India face a brutal learning challenge: the syllabus is enormous, preparation takes years, and most aspirants have day jobs.

The standard solution — books, coaching classes, YouTube videos — requires screen-on, focused attention. But working aspirants commute 1–2 hours daily, often in conditions where phones can't be used for reading. Commute time goes entirely unused.

The insight: passive commute hours are wasted learning inventory. If we could convert dense UPSC content into high-quality audio, learners could use time they were already spending.

Existing audio content for UPSC? Mostly low-quality YouTube recordings with slow pacing and minimal structure. Nothing was built specifically for audio-native learning.

The user

Working UPSC/MPSC aspirant, 24–34 years old, employed full-time, 1–3 hours/day available for exam prep, commuting 45–90 minutes daily.

Core frustration: "I know what I need to study. I just can't find the time to sit and study it."

Job to be done: make progress on UPSC preparation during time that would otherwise be completely wasted.

What Listen2RE actually does

Listen2RE is not a podcast or a text-to-speech app. It's an AI-augmented content system that:

1. Ingests source material: UPSC syllabus content, current affairs, previous year questions
2. Processes it through an LLM pipeline: structured summarization, key concept extraction, audio-friendly rewriting
3. Produces audio-native content: scripted, structured, paced for listening (not reading aloud)
4. Delivers a daily listening session: 15–25 minutes, topic-specific, scheduled for commute time

Key distinction: the content is designed for ears, not converted from text. A document designed to be read has different sentence structure and pacing than content designed to be heard. The LLM pipeline handles this translation.
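To make "designed for ears" concrete, here is a minimal sketch of an automated check a script could be run through before voice synthesis. It is an illustration, not the production pipeline: the 22-word sentence limit, the list of visual cues, and the function names are all assumptions.

```python
import re

# Hypothetical threshold and patterns, chosen for illustration only.
MAX_WORDS_PER_SENTENCE = 22
VISUAL_CUES = re.compile(
    r"\b(see (the )?(table|figure|chart)|as shown (above|below))\b",
    re.IGNORECASE,
)


def split_sentences(script: str) -> list[str]:
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [p for p in parts if p]


def audio_issues(script: str) -> list[str]:
    """Return human-readable warnings for sentences that are hard to follow by ear."""
    issues = []
    for sentence in split_sentences(script):
        n_words = len(sentence.split())
        if n_words > MAX_WORDS_PER_SENTENCE:
            issues.append(f"long sentence ({n_words} words): {sentence[:40]}")
        if VISUAL_CUES.search(sentence):
            issues.append(f"visual reference: {sentence[:40]}")
    return issues
```

A script with no warnings is not automatically good audio, but a script with many warnings is a reasonable candidate for another rewriting pass.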

The architecture

Source material (UPSC docs, current affairs, PYQs)
         ↓
Ingestion + chunking (Python)
         ↓
LLM processing (Claude API)
  - Topic extraction
  - Concept summarization
  - Audio-script formatting
  - Key term callout generation
         ↓
Quality review (human spot-check + LLM-as-a-Judge)
         ↓
Voice synthesis (TTS pipeline)
         ↓
Published to platform

Stack: Custom mobile-optimized web app · Claude API · n8n for pipeline orchestration · Mixpanel for engagement tracking · CDN for audio delivery
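The first two stages of the diagram can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: chunking is done on blank-line paragraph boundaries with a hypothetical 400-word budget, and `write_script` is a placeholder for the LLM call rather than a real Claude API client.

```python
from typing import Callable


def chunk_paragraphs(text: str, max_words: int = 400) -> list[str]:
    """Group consecutive paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words is kept whole as its own chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        n = len(para.split())
        # Close the current chunk if adding this paragraph would exceed the budget.
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def run_pipeline(
    text: str,
    write_script: Callable[[str], str],  # placeholder for the LLM step
    max_words: int = 400,
) -> list[str]:
    """Chunk the source text and turn each chunk into an audio script."""
    return [write_script(chunk) for chunk in chunk_paragraphs(text, max_words)]
```

Passing the LLM step in as a function keeps the chunking logic testable without network access. For example, `run_pipeline(doc, str.upper)` exercises the flow with a trivial stand-in.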

Key product decisions (and trade-offs)

Decision 1: AI-generated scripts reviewed by humans, not the reverse

Early approach: human experts wrote scripts, AI edited them. Problem: too slow, too expensive, couldn't scale.

Pivot: AI generates the scripts and a human expert does a 10-minute spot-check. Result: a 60% reduction in content production effort. Quality is maintained through prompt engineering plus an LLM-as-a-Judge pre-filter.

Trade-off: occasional quality misses that human-first would catch. Acceptable because we built a user feedback loop (rate this session) that surfaces quality issues fast.
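A rough sketch of how an LLM-as-a-Judge pre-filter can sit in front of the human reviewer. The 0.0 to 1.0 score scale, the 0.8 threshold, the queue names, and the `judge` callable are all assumptions for illustration; in a real setup the judge would wrap a model call with a grading rubric.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    score: float  # assumed scale: 0.0 (unusable) to 1.0 (publishable)
    reason: str


PASS_THRESHOLD = 0.8  # hypothetical cut-off


def triage(
    scripts: list[str],
    judge: Callable[[str], Verdict],
    threshold: float = PASS_THRESHOLD,
) -> dict[str, list[tuple[str, str]]]:
    """Split scripts into a quick spot-check queue and a flagged queue."""
    queues: dict[str, list[tuple[str, str]]] = {"spot_check": [], "flagged": []}
    for script in scripts:
        verdict = judge(script)
        queue = "spot_check" if verdict.score >= threshold else "flagged"
        queues[queue].append((script, verdict.reason))
    return queues


def pass_rate(queues: dict[str, list[tuple[str, str]]]) -> float:
    """Share of scripts that cleared the judge, or 0.0 for an empty batch."""
    total = len(queues["spot_check"]) + len(queues["flagged"])
    return len(queues["spot_check"]) / total if total else 0.0
```

Keeping the judge's reason next to each flagged script gives the human reviewer a starting point instead of a bare rejection.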

Decision 2: Daily push over on-demand library

Option A: Build a large content library, users browse and choose. Option B: Push one daily session at 6:30 AM, matched to the user's current topic.

We chose Option B. For habit-forming behavior, consistency beats choice. The paradox of choice applies strongly to exam aspirants: too many options create decision fatigue and skipped sessions.

Trade-off: less perceived user control. Offset by allowing topic overrides and replay.
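The daily-push rule with a topic override can be expressed compactly. The sketch below is illustrative only: the `Learner` fields, the catalog layout (topic mapped to an ordered list of session IDs), and the session IDs themselves are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Learner:
    current_topic: str
    completed: set[str] = field(default_factory=set)
    override_topic: Optional[str] = None  # set when the learner picks a topic


def pick_daily_session(
    learner: Learner, catalog: dict[str, list[str]]
) -> Optional[str]:
    """Return the next unheard session for the learner's topic.

    An explicit override wins over the current topic. Returns None when
    every session in the chosen topic has already been completed.
    """
    topic = learner.override_topic or learner.current_topic
    for session_id in catalog.get(topic, []):
        if session_id not in learner.completed:
            return session_id
    return None
```

The learner sees exactly one session per day, and the override is the only point where a choice is exposed.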

Decision 3: Mobile web over native app (initially)

The instinct was to build a native app immediately. We didn't.

Reason: ship speed. The web app was live in 8 weeks. A native app would have taken 6+ months.

Trade-off: no offline listening (the #1 feature request). Addressed in a subsequent iteration with progressive web app caching.

Metrics

Metric	Result
Total learners served	35,000+
Content production effort reduction	60%
Average session completion rate	71%
User-reported "helps me use commute time"	84%
LLM-as-a-Judge quality pass rate	87%
Human review flag rate	13% (down from 31% in month 1)
Cost per content session produced	Reduced 40% over 6 months
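For readers who want to reproduce a metric like session completion rate from raw playback data, here is one possible definition in code. It is an assumption, not the exact definition behind the table: a session counts as completed when at least 90% of its length was played, and sessions that were never started are excluded.

```python
def completion_rate(
    sessions: list[tuple[float, float]], threshold: float = 0.9
) -> float:
    """Fraction of started sessions that were listened to the end.

    sessions holds (seconds_listened, session_length_seconds) pairs.
    The 0.9 threshold is a hypothetical definition of "completed".
    """
    started = [
        (listened, length)
        for listened, length in sessions
        if length > 0 and listened > 0
    ]
    if not started:
        return 0.0
    completed = sum(
        1 for listened, length in started if listened / length >= threshold
    )
    return completed / len(started)
```

Whatever definition is used, it should be fixed before launch so the number stays comparable from month to month.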

What I'd do differently

1. Personalization earlier. The daily push is a blunt instrument. Topic-sequencing logic based on individual learner progress should have been month 2, not month 8.

2. Community layer sooner. Learners wanted to discuss content. We had no social layer for 14 months. A simple discussion thread per episode would have improved retention significantly.

3. Evaluation framework before launch. We shipped the first version without systematic LLM output evaluation. First month had quality inconsistencies that hurt early retention. The LLM-as-a-Judge pipeline should have been pre-launch infrastructure.

4. Voice quality was the real product. We underinvested in voice synthesis quality early. The jump in retention when we upgraded the voice pipeline was larger than any feature we shipped. Sound quality IS the product for an audio platform.

The broader lesson

Listen2RE taught me something I apply to every AI product I build:

The AI is invisible. The value is the outcome.

Learners don't care that an LLM is generating the scripts. They care that their commute now produces measurable UPSC progress. The AI is infrastructure. The product is the habit it enables.

This reframe changes everything about how you design AI products: you're not shipping an AI feature, you're shipping an outcome. Design around the outcome.

Sujit Chankhore is an AI Product Manager and founder based in Pune, India.
