Smart Search — Semantic Discovery
Turn misspelled queries, vague descriptions, and natural‑language questions into instant “that’s exactly what I wanted” moments. Our vector‑powered search engine reads intent—not just keywords—slashing zero‑result pages, doubling conversion for searchers, and driving a 60 % jump in search‑attributed revenue.
Industry
E‑commerce & SaaS Search · Knowledge Management · Gen‑AI
Service
Vector‑Search Architecture · Gen‑AI Retrieval + Rerank · Multilingual NLP · Search Merchandising Suite
Team Setup
1 Product Lead · 2 Search Architects · 4 ML Engineers · 3 Frontend Devs · 2 Data Scientists · 2 QA Specialists · 3 DevOps
Timeline
9 Months
Story
Goal
Build an intent‑aware, sub‑second search experience that would:
- Cut zero‑result rate by ≥ 20 % (Dropbox saw 17 % with semantic search).
- Boost searcher conversion at least 2× over site average.
- Auto‑surface answers, categories, and bundles—no manual synonym lists.
- Deliver results in < 150 ms P95 across 10 language locales.
Challenge
Keyword legacy and massive catalogs hindered accurate results:
- Keyword legacy — Boolean match missed slang & compound queries.
- Catalog sprawl — 1.4 M SKUs, 12 languages, daily delta > 50 K items.
- Zero‑result rage — 9.6 % of sessions ended empty; bounce soared to 42 %.
- Cold‑start vectors — new items lacked embedding context.
- Facet explosion — 7 K attributes needed dynamic ranking.
- Latency budget — success required < 150 ms including rerank.
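One way to make the latency budget concrete is to check the 95th percentile of measured end-to-end times against the 150 ms target. The sketch below is illustrative only: the function names and sample values are assumptions, and it uses the simple nearest-rank percentile definition, which may differ from what a given monitoring stack reports.

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) done in integer arithmetic; the rank is 1-based.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(samples_ms, budget_ms=150):
    """True when P95 latency (rerank included) is under the budget."""
    return p95(samples_ms) < budget_ms


# Hypothetical measurements: 18 fast requests and two slow outliers.
samples = [40] * 18 + [120, 400]
print(p95(samples), within_budget(samples))
```

A single slow outlier does not break the budget here, which is the point of tracking P95 rather than the maximum.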
Our Approach
Discover
Shadowed merchandisers & ran 80 K anonymised query audits; found 53 % were natural‑language or misspelled.
Design
Prototyped hybrid BM25 + vector recall; user‑tested semantic facets; tuned embeddings on domain vocabulary.
Deploy
Deployed on GCP GKE; real‑time embedding via GPU autoscale; blue/green index swaps nightly.
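The hybrid recall prototyped in the Design phase can be pictured as merging two ranked lists, one from BM25 and one from vector search. The case study does not say which fusion method was used, so the sketch below assumes reciprocal rank fusion, a common choice; the function name, the constant k=60, and the SKU ids are all illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Items near the top of any list contribute the most.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["sku-1", "sku-2", "sku-3"]      # hypothetical keyword results
vector_hits = ["sku-3", "sku-1", "sku-4"]    # hypothetical semantic results
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Items that both retrievers agree on rise to the top, while items found by only one retriever still make the candidate list.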
Challenge
The Mountain to Climb
Staging a global vector index that’s always fresh and always fast:
1.4 M‑item vector index
Replicated across three regions for global performance.
Streaming ingest < 45 s
From PIM change to searchable in near real‑time.
Multilingual query → English pivot
M2M‑100 translates queries to English on the fly before they are embedded.
Compliance: GDPR
“Right to be forgotten” requests must purge data within 60 s.
Per‑tenant synonyms & rules
Multi‑client architecture overlays brand preferences.
Facet relevance learning
No large labeled data sets were available, so facet ranking learns incrementally from feedback.
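The per-tenant overlay idea can be sketched as a tenant dictionary layered over a shared one, so a brand's preferences win wherever both define the same term. This is a minimal illustration under assumed names (GLOBAL_SYNONYMS, TENANT_SYNONYMS, expand) and toy data, not the production rule engine.

```python
from collections import ChainMap

# Hypothetical shared synonyms plus one brand-specific override.
GLOBAL_SYNONYMS = {"sneakers": ["trainers", "running shoes"], "tee": ["t-shirt"]}
TENANT_SYNONYMS = {"brand-a": {"sneakers": ["kicks"]}}


def synonyms_for(tenant_id):
    """Tenant entries shadow global ones; unknown tenants see only globals."""
    return ChainMap(TENANT_SYNONYMS.get(tenant_id, {}), GLOBAL_SYNONYMS)


def expand(query, tenant_id):
    """Return the query tokens, each followed by its synonyms for this tenant."""
    table = synonyms_for(tenant_id)
    terms = []
    for token in query.lower().split():
        terms.append(token)
        terms.extend(table.get(token, []))
    return terms


print(expand("Sneakers sale", "brand-a"))
print(expand("sneakers", "brand-b"))
```

Because the overlay is resolved per request, one tenant's rules never leak into another tenant's results.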
Additional Hurdles
Scale ingestion to 50 K+ SKU changes a day without a re‑index meltdown.
Nightly blue/green index swaps; sub‑second failover needed.
Ensure data security, multi‑tenant separation, & compliance logging.
These complexities pushed us to engineer a resilient, multilingual search stack that meets every brand’s operational and compliance demands.
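A blue/green index swap can be pictured as an alias that queries always go through: the new build is checked first and only then promoted, so a bad build never takes traffic. The sketch below is a simplified in-process illustration; the IndexAlias class, the dict-based indexes, and the smoke-query check are assumptions, not the deployed mechanism.

```python
import threading


class IndexAlias:
    """Points queries at whichever index build is currently live."""

    def __init__(self, live_index):
        self._live = live_index
        self._lock = threading.Lock()

    def search(self, query):
        # Read the reference once so a concurrent swap cannot change
        # the index halfway through handling this query.
        index = self._live
        return index.get(query, [])

    def swap(self, candidate, smoke_queries):
        """Promote the candidate only if every smoke query returns results."""
        if not all(candidate.get(q) for q in smoke_queries):
            return False  # keep serving the current index
        with self._lock:
            self._live = candidate
        return True


blue = {"trail shoes": ["sku-1"]}             # hypothetical current build
green = {"trail shoes": ["sku-1", "sku-9"]}   # hypothetical nightly build
alias = IndexAlias(blue)
promoted = alias.swap(green, smoke_queries=["trail shoes"])
print(promoted, alias.search("trail shoes"))
```

If the candidate fails its smoke queries, the alias keeps pointing at the previous build, which doubles as the rollback path.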
Key Modules Engineered
Covering recall, rerank, personalization, and beyond—every module orchestrates frictionless discovery.
Vector Recall Engine
Approx‑NN (HNSW) + BM25 hybrid; recall +38 %.
Semantic Spell & Stemming
Understands “nikes for trail runnin’.”
Intent Classifier
Detects FAQ vs SKU lookup → dynamic widgets.
Rerank (LLM)
GPT‑4o reranks top 50; MRR +12 %.
Dynamic Facets
Learns facet order per query; CTR +18 %.
Zero‑Result Rescue
Embedding‑expansion fallback; ZRR −31 %.
Multilingual Pipeline
10 locales, pivot → EN embeddings.
Auto‑Synonym Miner
Mines click‑logs for new synonyms nightly.
Query Analytics Hub
Heat maps and long‑tail query reports for merch teams.
Personalised Boosting
Vector merge with user profile; CR +9 %.
Voice & Vision Search
ASR for voice, CLIP embeddings for images.
Edge‑Cache Accelerator
CDN‑side candidate cache; 40 % of queries served in 20 ms.
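As an illustration of how a zero-result rescue can work, the sketch below tries a strict keyword match first and falls back to nearest neighbours in embedding space only when that match comes back empty. The toy two-dimensional vectors, the min_sim threshold, and the function names are assumptions made for the example; a real deployment would query the HNSW index rather than scan the catalog.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_with_rescue(query_terms, query_vec, catalog, top_k=2, min_sim=0.5):
    """Keyword match first; embedding fallback when nothing matches."""
    exact = [
        item["id"]
        for item in catalog
        if all(term in item["title"].lower().split() for term in query_terms)
    ]
    if exact:
        return exact, "keyword"
    # Rank every item by similarity to the query embedding.
    scored = sorted(
        ((cosine(query_vec, item["vec"]), item["id"]) for item in catalog),
        reverse=True,
    )
    rescued = [doc_id for sim, doc_id in scored[:top_k] if sim >= min_sim]
    return rescued, "rescue"


# Hypothetical catalog with made-up embeddings.
catalog = [
    {"id": "sku-1", "title": "Trail running shoes", "vec": [0.9, 0.1]},
    {"id": "sku-2", "title": "Road running shoes", "vec": [0.7, 0.3]},
    {"id": "sku-3", "title": "Espresso machine", "vec": [0.0, 1.0]},
]
print(search_with_rescue(["nikes", "trail"], [1.0, 0.0], catalog))
```

The similarity floor matters: without it, the fallback would return something for every query, including ones where an honest empty page is the better answer.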
User Research Insights
Shoppers who use on‑site search convert at a rate up to 50 % higher than average.
67 % abandon after two failed searches; semantic rescue retained 42 % of them.
“I love that it understands eco‑friendly running shoes without filters.”
— beta tester
ROI / Business Impact
Outcome
From guesswork to guided discovery—one search engine now powers global multi‑brand revenue lifts.
Revenue & Growth
- Searcher conversion 2.1 × site average, adding $84 M GMV.
- Long‑tail sales +33 % from semantic expansion.
Shopper Experience
- Zero‑result rate −31 %, bounce −22 %.
- Average time‑to‑first‑click 1.2 s → 0.6 s.
Operational Efficiency
- Merch team freed 10 hrs/week—no manual synonym work.
- Catalog ingestion pipeline now 45 s vs 15 min baseline.
Brand Impact
- Featured by Algolia as “Top Semantic Search Launch 2025.”
- Net Promoter Score on search widget +16 pts.
Feature Highlights
Vector Recall Engine
LLM Rerank
Zero‑Result Rescue
Dynamic Facets
Auto‑Synonym Miner
Multilingual Pivot
Personalised Boost
Voice & Vision Search
Edge‑Cache Accelerator
Query Analytics
Intent Classifier
Image‑first Embeddings
Real‑time Inventory Tie‑in
Merch Rules GUI
Security + GDPR Purge
Want to turn every query into a sale?
Book a discovery call—we’ll plug in your catalog, stand up semantic search, and prove the uplift in a 30‑day pilot.