Podfolio

Podcast Content Was Sitting on 4.2 Million Hours of Business Intelligence. Nobody Had Built the Infrastructure to Use It.

Every podcast episode is a conversation between people who know things. Guests reveal pain points, software stacks, revenue figures, growth challenges - in their own words, on the record, searchable if you know how to search. We built Podfolio: three interconnected products that turn the podcast industry's raw audio into a commercial operating layer.

AI Platform · Vector Search · LLM · Audio Intelligence · B2B Lead Generation · Podcast Tech
Speaker-diarized transcription
4.2M+ episode semantic index
Three products, one data layer
Engagement profile
4.2M+
Transcribed episodes indexed and searchable
92%
Guest-to-host match score using vector similarity
3
Interconnected products. One ecosystem.
Product View
PDF Page 13 - Podfolio CRM AI Landing Page
Stop losing leads. Turn guest interviews into high-ticket sales pipelines, with speaker diarization, pain point extraction, and structured JSON output.
The Origin

Three markets, one underlying data-layer problem.

Podfolio wasn't built for a single client with a single problem. It was built around a structural gap in an entire industry - one that became visible once you looked at podcasting not as a media format but as a data layer.

Podcast hosts spend thousands of dollars and hundreds of hours producing conversations with guests who are, in many cases, exactly the kind of leads their business needs. Those conversations contain real intelligence - the guest mentions their tech stack, their biggest operational challenge, the revenue milestone they just crossed. That intelligence lives in audio files that nobody searches, on platforms that return keyword matches rather than meaning, managed by hosts who have no system for acting on what they heard.

On the other side: brands, agencies, and sales teams who need to reach niche audiences are spending heavily on advertising that targets demographics instead of conversations. The podcast industry had a matching problem, a lead generation problem, and a search problem - all rooted in the same underlying issue. Audio is the most information-dense medium most businesses produce, and almost none of that information is structured for use.

"Every episode I record, I'm sitting across from someone who could be a client, a partner, or a referral source. And I had no way of knowing which ones, because I couldn't search what I'd said or what they'd said. It was all just audio sitting in a folder."

The Problem

The podcast industry had solved distribution. It hadn't solved intelligence.

Apple and Spotify solved the discovery and distribution problem for listeners. They didn't solve the intelligence problem for hosts, brands, or sales teams. Once an episode was published, the content inside it became effectively unsearchable, the guests inside it became untracked, and the business signals inside it became invisible. Three distinct industries - podcast hosting, B2B lead generation, and media sponsorship - were all sitting on the same untapped layer and none of them had the infrastructure to use it.

Pain point 01

Hosts were booking guests blind

A podcast host looking for their next guest had no systematic way to find someone whose expertise genuinely matched their audience's interests. The process was referral-based, LinkedIn-based, or PR-pitch-based - all of which favour guests with large existing audiences over guests with the most relevant expertise. The match quality suffered. The outreach was manual. And the pitch, when it finally arrived, was generic.

Pain point 02

Interview content was generating leads nobody was capturing

A guest who mentions their enterprise churn problem, their Salesforce dependency, and their $5M ARR in a 40-minute interview has just self-qualified as a prospect for a dozen different products. The host had the conversation. The content existed as audio. And none of that signal was being captured, structured, or acted on. The lead walked out of the recording studio and disappeared.

Pain point 03

Sponsorship targeting was demographic, not contextual

Brands paying to reach podcast audiences were targeting by listener count and broad demographic - the same metrics that drove display advertising in 2010. A brand selling enterprise software was sponsoring shows with "business" audiences, not shows where CEOs had specifically discussed the problem the software solved. The contextual match that would make sponsorship genuinely effective wasn't possible without the ability to search episode content at scale.

Pain point 04

There was no Google for podcast audio

Natural language queries about podcast content returned keyword matches at best. "Find me every episode where a SaaS founder discussed churn" returned nothing useful because the content was in audio, the transcripts were partial or absent, and the search infrastructure had been built for titles and descriptions - not for meaning. The information existed. The retrieval layer didn't.

What they tried

Podcast hosts had tried managing guest relationships in generic CRMs not built for the booking workflow. Brands had tried manual research and PR agencies for sponsorship placements. Sales teams had tried keyword searches across transcript databases that returned volume without relevance. Each partial solution produced partial results - and none of them addressed the underlying problem, which was that podcast audio had never been treated as structured, queryable business data.

Key Insight

Podcast audio isn't media. It's the most honest business intelligence most companies produce - and it's been sitting unindexed.

When a guest speaks on a podcast, they're not in a sales call, a press release, or a prepared statement. They're having a conversation. They say what they actually think about their tools, their challenges, their numbers. That candour, at scale across millions of episodes, is a data layer that no other medium produces. The insight that shaped Podfolio was simple: if you could index that layer properly and build the right retrieval and matching infrastructure on top of it, you'd have three distinct products serving three distinct markets - and they'd all be drawing from the same underlying data.

Our Approach

Three products. One data architecture underneath all of them.

Discovery

Before designing any individual product, we mapped the full commercial landscape of the podcast industry - who the participants were, what each of them needed, and where the same underlying data could serve multiple use cases simultaneously. The guest-host matchmaking product, the CRM lead generation product, and the search and intelligence product all run on the same ingestion pipeline, the same transcript index, and the same vector database. The architecture decision to unify the data layer was made before the first product feature was specified. It's what made three products possible without three separate infrastructure builds.

Design philosophy

Every product feature was designed around one question: does this transform raw audio into something actionable? Transcription alone is not a product - it's a prerequisite. Speaker diarization alone is not intelligence - it's structure. The intelligence layer is what you build on top: entity extraction that identifies pain points and revenue signals, vector embeddings that enable semantic matching rather than keyword matching, LLM-drafted outreach that references specific episode content. The pipeline from audio to action was the design brief, and every component was evaluated against how well it served that pipeline.

The Solution

Three products. Built to serve every commercial layer of the podcast industry.

The Match

Guest & Host Matchmaking Network

A dual-sided marketplace where AI curates the introduction rather than leaving it to referral networks and PR pitches. The platform builds deep expertise profiles from each guest's RSS feed and LinkedIn history - understanding not just their job title but the specific topics they speak about with depth and consistency. When a host is looking for a guest, the system returns matches scored by cosine vector similarity against the host's own episode content, with match scores up to 92%. The outreach pitch is drafted by the LLM - not generically, but referencing specific episode topics the guest has covered that align with the host's audience.

Product View
PDF Page 14 - Matchmaking Pipeline
Profile Creation -> The Match -> The Pitch -> The Connection, with AI Pitch Writer panel and 92% match score.
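
A minimal sketch of the scoring idea described above, assuming guest profiles and host episodes are already embedded; the function names, the top-5 averaging, and the percentage mapping are illustrative rather than the production logic.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_score(host_episode_embeddings: list[np.ndarray],
                guest_profile_embedding: np.ndarray) -> float:
    """Score a guest against a host's back catalogue.

    Compares the guest's expertise profile to each of the host's episode
    embeddings and averages the strongest matches, so a guest who aligns
    closely with part of the catalogue isn't diluted by unrelated episodes.
    """
    sims = sorted(
        (cosine_similarity(e, guest_profile_embedding)
         for e in host_episode_embeddings),
        reverse=True,
    )
    top_k = sims[:5] or [0.0]
    # Map from [-1, 1] into a 0-100 style score for display.
    return round((sum(top_k) / len(top_k) + 1) / 2 * 100, 1)
```

The raw similarity is only one input; the composite scoring that produces the displayed match percentage is covered under How It Was Built.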
Sub-feature

AI Pitch Writer

The LLM doesn't draft a template. It reads the host's recent episodes, identifies the specific topics where the guest's expertise intersects, and writes an outreach message that references that intersection directly. A pitch that opens with "Loved your thoughts on AI in healthcare in Episode 14" and connects it to the guest's decade of MedTech experience converts differently from a pitch that opens with a job title and a follower count.

Product View
PDF Page 14 - AI Pitch Writer Panel
Draft email with highlighted episode reference for AI in healthcare and contextual guest-host fit.
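
A hedged sketch of how an episode-aware pitch might be drafted with OpenAI (part of the stack listed below); the prompt wording, model name, and `draft_pitch` helper are assumptions for illustration, not Podfolio's production code.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def draft_pitch(host_episodes: list[dict], guest_bio: str) -> str:
    """Draft an outreach pitch that cites specific host episodes.

    `host_episodes` is assumed to be a list of {"title", "summary"} dicts
    pulled from the shared transcript index.
    """
    episode_context = "\n".join(
        f"- {e['title']}: {e['summary']}" for e in host_episodes[:5]
    )
    prompt = (
        "You are writing a short podcast guest pitch.\n"
        f"Recent episodes from the host:\n{episode_context}\n\n"
        f"Guest background:\n{guest_bio}\n\n"
        "Reference one specific episode where the guest's expertise "
        "overlaps, and explain the fit in two sentences."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```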
The Pipeline

Podfolio CRM AI

Every guest interview is a sales conversation waiting to be processed. Raw audio from guest interviews goes through Deepgram and Whisper for speaker-diarized transcription - each speaker's words separated and attributed. The AI then continuously scans the transcript for business signals: pain points, software dependencies, revenue metrics, growth challenges. A guest who says "yeah, our biggest issue right now is high churn rate - we're on Salesforce but it's too clunky for our $5M ARR" generates a structured lead record: pain_point: "High Enterprise Churn", software_stack: ["Salesforce"], revenue_metric: "$5M ARR". That record flows into a visual Kanban pipeline - Leads -> Booked -> Closed Won - with an automated 48-hour follow-up email sequence drafted from the interview context.

Product View
PDF Page 14 - CRM Kanban Dashboard
Leads, Booked, Closed pipeline with smart follow-up sequencing and contextual deal cards.
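
For concreteness, the lead record from the example above might look like this as a typed structure; the field names mirror the JSON keys quoted in the case study, while the guest name and episode ID are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum

class PipelineStage(Enum):
    LEADS = "Leads"
    BOOKED = "Booked"
    CLOSED_WON = "Closed Won"

@dataclass
class LeadRecord:
    """Structured lead extracted from a diarized interview transcript."""
    guest_name: str
    episode_id: str
    pain_point: str                                            # e.g. "High Enterprise Churn"
    software_stack: list[str] = field(default_factory=list)    # e.g. ["Salesforce"]
    revenue_metric: str | None = None                          # e.g. "$5M ARR"
    stage: PipelineStage = PipelineStage.LEADS

# The quote in the paragraph above would map to a record like this:
lead = LeadRecord(
    guest_name="Jane Doe",      # hypothetical
    episode_id="ep-0142",       # hypothetical
    pain_point="High Enterprise Churn",
    software_stack=["Salesforce"],
    revenue_metric="$5M ARR",
)
```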
Sub-feature

CRM Automation Pipeline

The five-stage pipeline - Post-Interview -> AI Analysis -> CRM Entry -> Auto Follow-up -> Nurturing - runs without manual input between stages. A host records an interview. The audio is uploaded. The AI identifies the leads, creates the CRM entries, and schedules the follow-up emails. By the time the episode is edited, the outreach is already in motion.

Product View
PDF Page 14 - CRM Automation Pipeline
Five-stage operational flow from interview upload through nurturing without manual handoffs.
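
A compact sketch of the five-stage flow, with hypothetical stub functions standing in for the real transcription, extraction, CRM, and scheduling services; names and signatures are illustrative only.

```python
import datetime

# Hypothetical stage functions standing in for the real services
# (Deepgram/Whisper transcription, the extraction layer, the CRM,
# and the email scheduler).
def transcribe_and_diarize(audio_path: str) -> str: return ""
def extract_leads(transcript: str) -> list[dict]: return []
def create_crm_entry(lead: dict) -> str: return "crm-0001"
def draft_followup(lead: dict, transcript: str) -> str: return ""
def schedule_email(crm_id: str, draft: str, send_at: datetime.datetime) -> None: pass
def enqueue_nurture_sequence(crm_id: str) -> None: pass

def process_interview(audio_path: str) -> None:
    """Post-Interview -> AI Analysis -> CRM Entry -> Auto Follow-up -> Nurturing."""
    transcript = transcribe_and_diarize(audio_path)                        # Post-Interview
    for lead in extract_leads(transcript):                                 # AI Analysis
        crm_id = create_crm_entry(lead)                                    # CRM Entry
        send_at = datetime.datetime.utcnow() + datetime.timedelta(hours=48)
        schedule_email(crm_id, draft_followup(lead, transcript), send_at)  # Auto Follow-up
        enqueue_nurture_sequence(crm_id)                                   # Nurturing
```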
The Index

Podfolio Search & Insights

A searchable index of 4.2M+ transcribed episodes, built from a distributed scraper that continuously ingests top podcasts from Apple and Spotify charts. Natural language queries don't return keyword matches - they decode intent and sentiment, returning timestamped audio hits and contextual quotes from the exact moment in the episode where the relevant content appears. A query like "find me every podcast with an audience over 5,000 that discussed Custom Web Development positively in the last 30 days" returns structured results with episode links, timestamps, and quote context. Enterprise users export bulk lead lists to CSV or connect directly via the Podfolio Graph API.

Product View
PDF Page 15 - Search & Insights Landing
Semantic search UI with live query processing and global scraper orchestration across podcast sources.
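
A simplified sketch of the retrieval step, assuming the query has already been embedded; brute-force cosine search is shown for clarity, whereas the production index is a vector database with approximate nearest-neighbour search at 4.2M+ episode scale.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class SegmentHit:
    podcast: str
    episode: str
    timestamp: str   # e.g. "00:23:41"
    quote: str
    score: float

def search(query_embedding: np.ndarray,
           segment_embeddings: np.ndarray,
           segments: list[dict],
           top_k: int = 10) -> list[SegmentHit]:
    """Return the transcript segments closest to the query in embedding space."""
    norms = np.linalg.norm(segment_embeddings, axis=1) * np.linalg.norm(query_embedding)
    scores = segment_embeddings @ query_embedding / norms
    best = np.argsort(scores)[::-1][:top_k]
    return [
        SegmentHit(
            podcast=segments[i]["podcast"],
            episode=segments[i]["episode"],
            timestamp=segments[i]["timestamp"],
            quote=segments[i]["quote"],
            score=float(scores[i]),
        )
        for i in best
    ]
```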
Sub-feature

Context Engine

The context engine surfaces not just that a topic was mentioned - but where, by whom, in what tone, and with what surrounding content. "47 Contextual Hits" for a given query comes back with the exact timestamp, the podcast name, the episode number, and the verbatim quote. You skip to the minute where someone praised your industry, not the episode where they mentioned it once in passing.

Product View
PDF Page 15 - Context Engine (47 Hits)
Timestamped quotes with highlighted terms and side-by-side relevance context across shows.
Sub-feature

Export & API Layer

Enterprise users don't just search - they integrate. Bulk lead export to CSV and direct Graph API access means the intelligence Podfolio surfaces feeds into whatever CRM, outreach tool, or data pipeline the client already runs.

Product View
PDF Page 15 - Export & API Layer
Structured JSON output and CSV export controls for direct downstream integration.
Tech stack
Next.js · Python · Vector DB · Deepgram · Whisper · OpenAI · Redis · PostgreSQL · Graph API
How It Was Built

One ingestion pipeline. Three products drawing from it simultaneously.

Approach note

The architecture decision that made Podfolio possible was treating the transcript index as shared infrastructure rather than per-product data. Every episode ingested by the scraper - transcribed, diarized, embedded, and indexed - is available to all three products. The matchmaking network reads the same transcript embeddings that the search product queries. The CRM AI processes the same diarized transcripts that the search index stores. Building the data layer once and building three product interfaces on top of it was the decision that made the scope achievable - and that makes the ecosystem coherent for users who move between products.
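
One way to picture the shared layer is as a single canonical episode record that the ingestion pipeline writes and all three products read; the schema below is an illustrative sketch, not the actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class TranscriptSegment:
    speaker: str          # diarized speaker label, e.g. "host" / "guest"
    start: float          # seconds into the episode
    end: float
    text: str
    embedding: list[float] = field(default_factory=list)

@dataclass
class EpisodeRecord:
    """Written once by the ingestion pipeline; read by matchmaking,
    the CRM AI, and search alike. Field names are illustrative."""
    episode_id: str
    podcast: str
    published_at: str
    segments: list[TranscriptSegment] = field(default_factory=list)
```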

Challenge

Transcription accuracy at 4.2 million episode scale

Whisper and Deepgram produce excellent transcriptions under controlled conditions. At 4.2M+ episodes, you encounter every possible audio quality scenario - variable recording setups, non-native English speakers, heavy domain-specific vocabulary, cross-talk between host and guest. We built a transcript quality scoring layer that flagged low-confidence segments for secondary processing and used speaker diarization confidence scores to route ambiguous attribution to a resolution queue. The goal wasn't perfect transcription - it was transcription good enough for semantic search and entity extraction, which have looser accuracy requirements than a verbatim transcript.
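
A minimal sketch of that routing logic, assuming each segment carries confidence scores from the transcription and diarization steps; the thresholds and field names are illustrative.

```python
def route_segment(segment: dict,
                  transcript_threshold: float = 0.80,
                  diarization_threshold: float = 0.70) -> str:
    """Decide what happens to a transcribed segment after first-pass ASR."""
    if segment["transcript_confidence"] < transcript_threshold:
        return "secondary_processing"   # re-run through the alternate ASR engine
    if segment["diarization_confidence"] < diarization_threshold:
        return "attribution_queue"      # ambiguous speaker, resolve separately
    return "index"                      # good enough for search and extraction
```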

Challenge

Cosine similarity alone doesn't make a good match

A vector similarity score tells you how close two embeddings are in vector space. It doesn't tell you whether a guest with genuine expertise in a topic would be a good fit for a host whose audience expects a particular level of depth, tone, or perspective. We built a multi-signal matching layer on top of the raw similarity score - incorporating episode engagement data, the guest's historical topic consistency, the host's typical episode structure, and audience demographic signals. The 92% match score reflects a composite signal, not a single similarity calculation.
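
A sketch of how a composite score might blend those signals, assuming each input is normalised to [0, 1]; the weights are illustrative, not the production tuning.

```python
def composite_match_score(similarity: float,
                          engagement: float,
                          topic_consistency: float,
                          audience_fit: float,
                          weights: dict | None = None) -> float:
    """Blend raw cosine similarity with the other match signals."""
    weights = weights or {
        "similarity": 0.5,
        "engagement": 0.2,
        "topic_consistency": 0.2,
        "audience_fit": 0.1,
    }
    score = (
        weights["similarity"] * similarity
        + weights["engagement"] * engagement
        + weights["topic_consistency"] * topic_consistency
        + weights["audience_fit"] * audience_fit
    )
    return round(score * 100, 1)   # displayed as a percentage, e.g. 92%
```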

Challenge

Entity extraction that distinguishes signal from noise

A podcast guest who says "we use Salesforce" is providing context. A guest who says "we use Salesforce but it's too clunky for our $5M ARR and we're actively evaluating alternatives" is providing a sales signal. Building an entity extraction layer that distinguishes those two cases - that identifies not just the software mention but the sentiment, the stated problem, and the implied intent - required fine-tuning the extraction prompts against a corpus of real interview transcripts labelled by humans who understood B2B sales contexts. The structured JSON output that makes the CRM AI useful depends entirely on extraction quality. We treated it as the most important engineering problem in the product.
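
A hedged sketch of that extraction step using OpenAI's JSON-output mode; the prompt wording and model are stand-ins for the fine-tuned production prompts described above.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EXTRACTION_PROMPT = """Extract B2B sales signals from this interview transcript segment.
Return JSON with keys: pain_point, software_stack, revenue_metric, intent.
Only report a signal when the speaker states a problem or an intent to change,
not when a tool or number is mentioned in passing. Use null when absent."""

def extract_signals(segment_text: str) -> dict:
    """Run signal-vs-noise extraction on one diarized transcript segment."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": EXTRACTION_PROMPT},
            {"role": "user", "content": segment_text},
        ],
    )
    return json.loads(response.choices[0].message.content)
```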

Results

An untapped data layer. Three products that make it useful.

4.2M+
Transcribed episodes in the searchable index
92%
Peak match score on guest-host recommendations
48hrs
Automated follow-up sequence triggered post-interview
3
Commercial layers of the podcast industry served by one data architecture

The hosts who adopted the matchmaking network described the same shift: the quality of their guest pipeline improved because the selection criteria improved. Instead of booking guests who were available and willing, they were booking guests whose specific expertise matched what their audience had responded to historically. The AI wasn't replacing the host's judgment - it was giving the judgment better inputs.

For sales teams using the Search & Insights product, the most significant change was the shift from demographic targeting to contextual targeting. Finding a podcast where a CEO specifically discussed enterprise churn - not a podcast with a "business" audience, but the specific episode where that specific problem was named - produced a targeting precision that demographic data couldn't replicate.

The CRM AI produced the clearest commercial outcome: leads that had previously existed as audio and disappeared now existed as structured records with automated follow-up sequences. The interview didn't end the relationship. It started a pipeline.

What We Learned

Build the shared layer right, and every product surface gets stronger.

Learning

Audio is structured data - it just hasn't been treated that way. Once you structure it, the applications multiply.

Every product in the Podfolio ecosystem is an application of the same underlying insight: spoken conversation, properly transcribed, diarized, and semantically indexed, is as queryable as any database. The infrastructure investment to get there - transcription at scale, vector embeddings, entity extraction - pays across multiple use cases simultaneously. We built it once. Three products drew from it. That ratio is only possible if you treat audio as data from the first architectural decision.

Learning

Semantic search requires a fundamentally different index than keyword search - and the difference is invisible until you try to build it.

A keyword search index stores terms. A semantic search index stores meaning. The infrastructure to build and query a vector database at 4.2M+ episode scale is categorically different from the infrastructure to build a keyword index of the same content. Teams that try to add semantic search to an existing keyword infrastructure consistently underestimate the scope of the rebuild. We started with the semantic architecture and let it determine the stack, rather than adapting an existing stack to support it.

Learning

In a multi-product ecosystem, the quality of the shared data layer determines the quality of every product built on it.

The matchmaking product's match quality, the CRM's extraction accuracy, and the search product's result relevance all flow from the same source: the transcript index and the embedding pipeline. An error in transcription degrades all three. An improvement in entity extraction improves all three. Building the shared layer with the rigour of production infrastructure - not a prototype - was the decision that made the product quality consistent across all three surfaces.

Next

Sitting on content that should be working harder?

Most businesses produce more intelligence than they capture - in calls, interviews, meetings, and conversations that live as recordings nobody searches. We've built the infrastructure to change that. If you have audio you're not using and leads you're not capturing, that's a conversation worth having.