Booking.com Scales Vector Search To 100M Embeddings With Weaviate

Original post

Most companies talk about vector search. Few share what it actually takes to scale to 100M+ embeddings in production.

Başak Eskili from @bookingcom joined the Weaviate Podcast to break down their AI journey, and it's packed with insights into building production systems at massive scale.

𝗧𝗵𝗲 𝗘𝘃𝗼𝗹𝘂𝘁𝗶𝗼𝗻:
• Started with keyword matching, then moved to semantic retrieval with 𝗢𝗽𝗲𝗻𝗦𝗲𝗮𝗿𝗰𝗵 on AWS
• Scaled to hundreds of millions of embeddings with strict latency requirements
• Migrated to 𝗪𝗲𝗮𝘃𝗶𝗮𝘁𝗲 to handle complex filtering, rising concurrency, and production-scale demands

𝗣𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻 𝗚𝗲𝗻𝗔𝗜 𝗶𝗻 𝗔𝗰𝘁𝗶𝗼𝗻:
Their partner-to-guest messaging agent is a real-world example of 𝗮𝗴𝗲𝗻𝘁𝗶𝗰 𝗔𝗜:
• 𝗪𝗲𝗮𝘃𝗶𝗮𝘁𝗲 retrieves relevant response templates
• 𝗔𝗣𝗜𝘀 fetch property and booking context
• The agent suggests templates, crafts grounded replies, or defers to humans (human-in-the-loop design!)
• Evaluation spans offline datasets, LLM-as-a-judge, A/B testing, and live partner feedback

@CShorten30 and Başak discuss how 𝗕𝗼𝗼𝗸𝗶𝗻𝗴.𝗰𝗼𝗺 evaluated Weaviate: tests with 100 million embeddings, filtered vector search, multi-threaded concurrency, reads during writes, and cost-efficient infrastructure provisioning. They also look ahead to personalized travel agents with memory systems that capture user preferences, session context, and long-term personalization!

Watch the full podcast here: https://www.youtube.com/watch?v=O9edM9ZS_FQ
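For anyone new to the term, here is a toy sketch of what "filtered vector search" over response templates means. It is a brute-force illustration in plain Python, not Weaviate's API and not Booking.com's implementation. The template IDs, the 3-dimensional vectors, the `lang` metadata field, and the `filtered_search` helper are all made up for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filtered_search(query, items, where, k=2):
    """Keep items whose metadata matches `where`, then rank them by similarity."""
    candidates = [
        item for item in items
        if all(item["meta"].get(field) == value for field, value in where.items())
    ]
    ranked = sorted(candidates, key=lambda item: cosine(query, item["vector"]), reverse=True)
    return [item["id"] for item in ranked[:k]]

# Hypothetical response templates with tiny hand-written embeddings.
TEMPLATES = [
    {"id": "late-checkin-en", "vector": [0.9, 0.1, 0.0], "meta": {"lang": "en"}},
    {"id": "parking-en", "vector": [0.1, 0.9, 0.0], "meta": {"lang": "en"}},
    {"id": "late-checkin-de", "vector": [0.95, 0.05, 0.0], "meta": {"lang": "de"}},
]

# Query vector standing in for an embedded guest message about arriving late.
best = filtered_search([1.0, 0.0, 0.0], TEMPLATES, where={"lang": "en"}, k=1)
print(best)
```

The German template is the closest vector overall, but the `lang` filter excludes it before ranking. A production system would use an approximate nearest-neighbor index instead of scanning every vector, which is where the filtering and latency trade-offs mentioned in the post come in.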

3:56 AM · May 26, 2026