This is a full product case study of Listen2RE — the AI-augmented audio learning platform I designed and launched as CEO and AI Product Lead at Zerton Education Technologies.
35,000+ learners. 60% reduction in manual content effort. Built from 0 to production in under a year.
The problem
UPSC and MPSC aspirants in India face a brutal learning challenge: the syllabus is enormous, preparation takes years, and most aspirants have day jobs.
The standard solution — books, coaching classes, YouTube videos — requires screen-on, focused attention. But working aspirants commute 1–2 hours daily, often in conditions where phones can't be used for reading. Commute time goes entirely unused.
The insight: passive commute hours are wasted learning inventory. If we could convert dense UPSC content into high-quality audio, learners could use time they were already spending.
Existing audio content for UPSC was largely low-quality YouTube uploads with slow pacing and minimal structure. Nothing was built specifically for audio-native learning.
The user
Working UPSC/MPSC aspirant, 24–34 years old, employed full-time, 1–3 hours/day available for exam prep, commuting 45–90 minutes daily.
Core frustration: "I know what I need to study. I just can't find the time to sit and study it."
Job to be done: make progress on UPSC preparation during time that would otherwise be completely wasted.
What Listen2RE actually does
Listen2RE is not a podcast or a text-to-speech app. It's an AI-augmented content system that:
- Ingests source material: UPSC syllabus content, current affairs, previous year questions (PYQs)
- Processes it through an LLM pipeline: structured summarization, key concept extraction, audio-friendly rewriting
- Produces audio-native content: scripted, structured, and paced for listening (not text read aloud)
- Delivers a daily listening session: 15–25 minutes, topic-specific, scheduled for commute time
Key distinction: the content is designed for ears, not converted from text. A document designed to be read has different sentence structure and pacing than content designed to be heard. The LLM pipeline handles this translation.
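To make "designed for ears" concrete, here is a minimal sketch of the kind of automated check a script could go through before voice synthesis. It is illustrative only, not the production code: the narration pace of 150 words per minute, the 25-word sentence limit, and all function names are assumptions. Only the 15–25 minute session length comes from the product itself.

```python
import re

WORDS_PER_MINUTE = 150        # assumed narration pace, not a measured figure
MAX_WORDS_PER_SENTENCE = 25   # assumed comfort limit for a listener
SESSION_MINUTES = (15, 25)    # session length stated in the case study


def split_sentences(script: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [p for p in parts if p]


def estimated_minutes(script: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate listening time from the word count."""
    return len(script.split()) / wpm


def fits_session(script: str) -> bool:
    """True if the estimated duration falls inside the session window."""
    low, high = SESSION_MINUTES
    return low <= estimated_minutes(script) <= high


def listening_issues(script: str) -> list[str]:
    """Flag constructions that read fine on paper but are hard to follow by ear."""
    issues = []
    for sentence in split_sentences(script):
        n_words = len(sentence.split())
        if n_words > MAX_WORDS_PER_SENTENCE:
            issues.append(f"long sentence ({n_words} words): {sentence[:40]}")
        if "(" in sentence or ")" in sentence:
            issues.append(f"parenthetical aside: {sentence[:40]}")
    return issues
```

A check like this does not replace the LLM rewrite. It only catches the most obvious reading-style leftovers (long sentences, parenthetical asides) and scripts that would overrun or underfill a commute session.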
The architecture
Source material (UPSC docs, current affairs, PYQs)
↓
Ingestion + chunking (Python)
↓
LLM processing (Claude API)
- Topic extraction
- Concept summarization
- Audio-script formatting
- Key term callout generation
↓
Quality review (human spot-check + LLM-as-a-Judge)
↓
Voice synthesis (TTS pipeline)
↓
Published to platform
Stack: Custom mobile-optimized web app · Claude API · n8n for pipeline orchestration · Mixpanel for engagement tracking · CDN for audio delivery
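The ingestion and LLM stages of the flow above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual pipeline: the `llm` argument is a hypothetical callable (prompt in, text out) standing in for whatever client wraps the model API, and the prompts, the `Script` fields, and the 2,000-character chunk size are all invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: takes a prompt string, returns the model's text.
LLM = Callable[[str], str]


def chunk_paragraphs(text: str, max_chars: int = 2000) -> list[str]:
    """Group paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than split.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


@dataclass
class Script:
    topic: str
    summary: str
    audio_script: str
    key_terms: list[str]


def process_chunk(chunk: str, llm: LLM) -> Script:
    """Run the four LLM steps in order: topic, summary, audio script, key terms."""
    topic = llm(f"Name the single main topic of this passage:\n{chunk}").strip()
    summary = llm(f"Summarize the key concepts in this passage:\n{chunk}").strip()
    audio_script = llm(
        f"Rewrite this summary as a script meant to be heard, not read:\n{summary}"
    ).strip()
    terms = llm(f"List the key terms in this script, comma-separated:\n{audio_script}")
    key_terms = [t.strip() for t in terms.split(",") if t.strip()]
    return Script(topic, summary, audio_script, key_terms)


def run_pipeline(source_text: str, llm: LLM, max_chars: int = 2000) -> list[Script]:
    """Chunk the source, then process each chunk independently."""
    return [process_chunk(c, llm) for c in chunk_paragraphs(source_text, max_chars)]
```

Passing the model call in as a plain function keeps the stages testable with a stub, and makes it easy to swap models or add retries without touching the pipeline logic.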
Key product decisions (and trade-offs)
Decision 1: AI-generated scripts reviewed by humans, not the reverse
Early approach: human experts wrote scripts, AI edited them. Problem: too slow, too expensive, couldn't scale.
Pivot: AI generates scripts, human expert does 10-minute spot-check. Result: 60% reduction in content production effort. Quality maintained through prompt engineering + LLM-as-a-Judge pre-filter.
Trade-off: occasional quality misses that human-first would catch. Acceptable because we built a user feedback loop (rate this session) that surfaces quality issues fast.
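As a rough sketch of how the two safety nets in this decision fit together (the judge pre-filter before human review, and user ratings after publishing), consider the following. The `judge` callable, the 0.7 score threshold, the 1–5 rating scale, and the rating cutoffs are all hypothetical values chosen for illustration, not the settings used in production.

```python
from typing import Callable

# Hypothetical interface: returns a quality score between 0.0 and 1.0.
Judge = Callable[[str], float]

PASS_THRESHOLD = 0.7   # assumed cutoff for the LLM-as-a-Judge pre-filter
MIN_AVG_RATING = 3.5   # assumed cutoff on an assumed 1-5 "rate this session" scale
MIN_VOTES = 5          # ignore sessions with too few ratings to be meaningful


def triage(
    scripts: list[str], judge: Judge, threshold: float = PASS_THRESHOLD
) -> tuple[list[str], list[str]]:
    """Split scripts into (passed, rejected) by judge score.

    Passed scripts go on to the human spot-check; rejected ones are regenerated.
    """
    passed: list[str] = []
    rejected: list[str] = []
    for script in scripts:
        (passed if judge(script) >= threshold else rejected).append(script)
    return passed, rejected


def pass_rate(passed: list[str], rejected: list[str]) -> float:
    """Share of scripts that cleared the judge (0.0 if there were none)."""
    total = len(passed) + len(rejected)
    return len(passed) / total if total else 0.0


def flag_low_rated(
    ratings: dict[str, list[int]],
    min_avg: float = MIN_AVG_RATING,
    min_votes: int = MIN_VOTES,
) -> list[str]:
    """Return session IDs whose average user rating falls below min_avg."""
    return [
        session_id
        for session_id, votes in ratings.items()
        if len(votes) >= min_votes and sum(votes) / len(votes) < min_avg
    ]
```

The judge filters before a human spends any time on a script, and the rating check catches what both missed after release, which is what makes the occasional quality miss tolerable.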
Decision 2: Daily push over on-demand library
Option A: Build a large content library, users browse and choose. Option B: Push one daily session at 6:30 AM, matched to the user's current topic.
We chose Option B. For habit-forming behavior, consistency beats choice. The paradox of choice applies strongly to exam aspirants: too many options create decision fatigue and skipped sessions.
Trade-off: less perceived user control. Offset by allowing topic overrides and replay.
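The daily-push logic with a topic override can be sketched in a few lines. This is an assumed shape, not the real scheduler: the data structures (`sessions_by_topic`, `completed`) and function names are hypothetical, and only the 6:30 AM push time and the override behavior come from the decision described above.

```python
from datetime import datetime, time, timedelta
from typing import Optional

PUSH_TIME = time(6, 30)  # daily push time from the case study


def next_push(now: datetime) -> datetime:
    """Return the next 6:30 AM strictly after `now`, in the same timezone as `now`."""
    today_push = datetime.combine(now.date(), PUSH_TIME, tzinfo=now.tzinfo)
    return today_push if now < today_push else today_push + timedelta(days=1)


def pick_session(
    sessions_by_topic: dict[str, list[str]],
    current_topic: str,
    completed: set[str],
    override_topic: Optional[str] = None,
) -> Optional[str]:
    """Pick the first unheard session for the user's topic.

    A user-chosen override topic takes precedence over the scheduled one.
    Returns None when every session in the topic has been completed.
    """
    topic = override_topic or current_topic
    for session_id in sessions_by_topic.get(topic, []):
        if session_id not in completed:
            return session_id
    return None
```

The user never browses a catalog here. The system makes one choice per day, and the override is the single escape hatch that restores some control without reintroducing a library to scroll through.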
Decision 3: Mobile web over native app (initially)
The instinct was to build a native app immediately. We didn't.
Reason: ship speed. Web app was live in 8 weeks. Native app would have taken 6+ months.
Trade-off: no offline listening (the #1 feature request). Addressed in a subsequent iteration with progressive web app caching.
Metrics
| Metric | Result |
|---|---|
| Total learners served | 35,000+ |
| Content production effort reduction | 60% |
| Average session completion rate | 71% |
| User-reported "helps me use commute time" | 84% |
| LLM-as-a-Judge quality pass rate | 87% |
| Human review flag rate | 13% (down from 31% in month 1) |
| Cost per content session produced | Reduced 40% over 6 months |
What I'd do differently
1. Personalization earlier. The daily push is a blunt instrument. Topic-sequencing logic based on individual learner progress should have shipped in month 2, not month 8.
2. Community layer sooner. Learners wanted to discuss content, and we had no social layer for 14 months. Even a simple discussion thread per episode would likely have improved retention.
3. Evaluation framework before launch. We shipped the first version without systematic evaluation of LLM output. The first month had quality inconsistencies that hurt early retention. The LLM-as-a-Judge pipeline should have been pre-launch infrastructure.
4. Voice quality was the real product. We underinvested in voice synthesis quality early. The jump in retention when we upgraded the voice pipeline was larger than any feature we shipped. Sound quality IS the product for an audio platform.
The broader lesson
Listen2RE taught me something I apply to every AI product I build:
The AI is invisible. The value is the outcome.
Learners don't care that an LLM is generating the scripts. They care that their commute now produces measurable UPSC progress. The AI is infrastructure. The product is the habit it enables.
This reframe changes everything about how you design AI products: you're not shipping an AI feature, you're shipping an outcome. Design around the outcome.
Sujit Chankhore is an AI Product Manager and founder based in Pune, India.