In this episode of the Don't Panic, It's Just Data podcast, Kevin Petrie, VP of Research at BARC and the podcast host, is joined by Dainius Jocas, Search Engineer at Vinted, and Radu Gheorghe, Software Engineer at Vespa.ai. They discuss how Vinted, an online marketplace for secondhand products, modernised its data architecture to address new AI search use cases and the challenges faced with Elasticsearch.
From the switch to Vespa to the advantages of supporting multiple languages and complex queries, the podcast offers insights into the trade-offs organisations must weigh when updating their search systems, especially for AI and machine learning applications.
Vinted Elasticsearch Challenges
Vinted’s search architecture was built on Elasticsearch before the company switched to Vespa. Elasticsearch worked, but it presented a few major challenges. With over 20 supported languages, the company's "index per language" approach created significant sharding problems, leading to infrastructure imbalances and constant adjustments.
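As a rough illustration of why an index per language can leave data nodes unevenly loaded, here is a minimal Python sketch. The document counts and the shard count are invented for illustration and are not Vinted's figures; the only pattern borrowed from the episode is that the largest language index is several times larger than the next one.

```python
# Hypothetical document counts per language index (illustrative only, not Vinted data).
docs_per_language = {"fr": 320, "de": 100, "it": 60, "lt": 20}
SHARDS_PER_INDEX = 4  # assumption: every language index gets the same shard count


def shard_sizes_index_per_language(counts, shards_per_index):
    """Documents per shard when every language lives in its own index."""
    return {lang: n / shards_per_index for lang, n in counts.items()}


def shard_size_single_index(counts, total_shards):
    """Documents per shard when all languages share one evenly distributed index."""
    return sum(counts.values()) / total_shards


per_language = shard_sizes_index_per_language(docs_per_language, SHARDS_PER_INDEX)
total_shards = SHARDS_PER_INDEX * len(docs_per_language)
single = shard_size_single_index(docs_per_language, total_shards)

# Ratio between the largest and the smallest shard in the per-language layout.
skew = max(per_language.values()) / min(per_language.values())

print("index per language:", per_language)
print("single shared index:", single)
print("largest/smallest shard ratio:", skew)
```

With the same total number of shards, the per-language layout produces shards of very different sizes, while a single shared index spreads the same documents evenly. Rebalancing the first layout means revisiting shard counts per index as each language grows.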
"The index for the French language, the biggest language that we support, was more than three times bigger than the second biggest language, which created imbalances in the Elasticsearch data nodes' load," Jocas explained.
In addition to these technical obstacles, organisational issues arose as teams responsible for different parts of the search process found themselves "pointing fingers at each other at an increasing rate." The need for a more integrated, effective solution became clear.
The Solution: A New Platform for a New Era
The search for a better solution led Vinted to Vespa. The initial adoption began with "one success story": a machine learning engineer working on recommendations discovered that Vespa was ten times faster than Elasticsearch for their use case.
This initial benchmark, run on a single decommissioned server, was a "true testament to how efficient Vespa is when it comes to serving requests," Jocas told Petrie.
Vespa helped Vinted solve its language problem by allowing a language to be set per document, which eliminates the need for separate indexes and the associated sharding headaches. As Jocas put it, "We got out of the sharding problem once and for all."
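A minimal sketch of what "a language per document" can look like when preparing documents for feeding. It uses Vespa's document JSON `put` format, but the `marketplace` namespace, the `item` document type, and the `language` and `title` fields are hypothetical names chosen for illustration, not Vinted's schema. The sketch also assumes the schema applies Vespa's `set_language` indexing expression to the `language` field so that the stated language drives text processing for that document.

```python
import json


def build_put(item_id: str, title: str, language: str) -> dict:
    """Build a Vespa document-JSON put operation that carries its own language."""
    return {
        # Hypothetical namespace ("marketplace") and document type ("item").
        "put": f"id:marketplace:item::{item_id}",
        "fields": {
            # Hypothetical field; the schema is assumed to use set_language on it.
            "language": language,
            "title": title,
        },
    }


# Documents in different languages go to the same document type,
# instead of being routed to one index per language.
ops = [
    build_put("1", "Robe d'été", "fr"),
    build_put("2", "Sommerkleid", "de"),
]

for op in ops:
    print(json.dumps(op, ensure_ascii=False))
```

Because the language travels with each document, all languages can share one content cluster, and the distribution of data no longer depends on how large any single language is.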
Takeaways
- Vinted faced challenges with its initial Elasticsearch architecture.
- The need for better integration between matching and ranking was identified.
- Vespa outperformed Elasticsearch in handling image search and recommendations.
- Transitioning to Vespa involved significant learning and support from developers.
- Vespa allows for language-specific document handling, simplifying architecture.
- Organisations must evaluate the complexity and volume of their data before transitioning.
- Vespa is optimised for query performance, while Elasticsearch excels in data writing.
- The learning curve for Vespa can be steep, but support is available.
- It's important to focus on optimising new systems rather than emulating old ones.
- Partial updates in Vespa are more efficient than in Elasticsearch.
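For the last takeaway, a partial update in Vespa's document JSON format names only the fields that change. The sketch below builds such an operation in Python; the `marketplace` namespace, the `item` document type, and the `price` field are hypothetical and do not come from the episode.

```python
import json


def build_partial_update(item_id: str, **changed_fields) -> dict:
    """Build a Vespa document-JSON update that assigns only the changed fields."""
    return {
        # Hypothetical namespace ("marketplace") and document type ("item").
        "update": f"id:marketplace:item::{item_id}",
        "fields": {
            name: {"assign": value} for name, value in changed_fields.items()
        },
    }


# Only the price is sent; the rest of the document is left untouched.
update = build_partial_update("1", price=12.5)
print(json.dumps(update))
```

An Elasticsearch update, by contrast, results in the whole document being reindexed, which is one plausible reason frequent small changes cost more there.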
Chapters
- 00:00 Introduction to Vinted and Vespa
- 02:05 Vinted's Initial Data Architecture Challenges
- 06:40 Common Trade-offs in Modernising Search Architecture
- 10:21 Transitioning to Vespa: Key Steps and Lessons Learned
- 14:11 Supporting Multiple Languages with Vespa
- 15:44 Vespa's Architecture and Its Flexibility
- 16:48 Use Cases: Vespa vs. Elasticsearch
- 20:34 Advice for Organisations Modernising Search Architecture
- 23:00 Final Thoughts on Transitioning to New Technologies
About Vespa.ai
Vespa.ai is an AI Search Platform for building and operating large-scale RAG, recommendation, and personalisation systems. It unifies data, inference, and ranking in a single query path for real-time results over massive datasets. Vespa delivers high throughput and low latency, and is available as a managed service.
About Vinted
Vinted.com is an online marketplace making second-hand a first choice. It connects buyers and sellers of pre-owned clothing, electronics, and home décor in a community-driven space. Active in 21 European markets plus the USA, Vinted serves 120 million users and 77 million monthly visitors. Its advanced search and recommendation engines handle 1 billion items and process 25,000 searches every second, with over 10,000 real-time data updates per second to keep the experience fast, relevant, and engaging.