Meet Marqo, an open source vector search engine for AI applications
Vector databases are the unsung heroes of the modern AI movement, storing numerical representations (embeddings) of unstructured data such as images, videos and text so that people and systems can search uncategorized content. They are particularly integral to large language models (LLMs) such as GPT-4 (which powers ChatGPT), owing in large part to the databases’ ability to power real-time indexing and search as data is created or updated. This is important for personalization features, recommendation systems, sentiment analysis and more.
The snowballing demand for generative AI has thrust myriad vector database startups into the spotlight, with many securing bucketloads of cash en route. In April alone, we saw Pinecone and Weaviate raise $100 million and $50 million, respectively, to grow their vector database smarts, while the same month fledgling vector database upstarts Chroma and Qdrant secured $18 million and $7.5 million, respectively, in seed financing. And late last year, Zilliz, the core developer behind the Milvus open source vector database, locked down $60 million in funding.
So it’s clear that companies helping infrastructure keep pace with the AI hype train are in high demand, something that Australian startup Marqo is now looking to capitalize on with a more holistic “end-to-end” approach to vector search.
Pain point
Founded out of Melbourne last June, Marqo is the brainchild of Jesse Clark, formerly lead machine learning scientist at Amazon’s robotics unit in Seattle, and Tom Hamer, previously a database software engineer for Amazon Web Services (AWS) in Sydney.
With Marqo, the crux of their mission is to solve the conundrum that is unstructured data, which constitutes up to 90% of all data that is created, according to some estimates. As more people turn to generative AI to answer their online queries or create new images and artwork, this only intensifies the need for new tools to make sense of it all.
A core selling point of Marqo, versus the incumbents, is that it promises a full array of vector search smarts out of the box, including vector generation, storage and retrieval. This means Marqo users can bypass third-party vector-generation tools from the likes of OpenAI or Hugging Face and access everything via a single API.
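To make the three stages concrete, here is a minimal toy sketch of vector generation, storage and retrieval sitting behind one interface. It is not Marqo's API or implementation: the `embed` function, the `ToyVectorIndex` class and the `DIM` constant are hypothetical names invented for illustration, and a simple hashing trick stands in for the machine learning model a real system would use to generate vectors.

```python
import hashlib
import math
from collections import Counter

DIM = 256  # toy embedding size, chosen arbitrarily for this sketch


def embed(text: str) -> list[float]:
    """Vector generation: hash each token into a bucket, then L2-normalize.

    This is a stand-in for a real embedding model (an assumption of the sketch).
    """
    vec = [0.0] * DIM
    for token, count in Counter(text.lower().split()).items():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ToyVectorIndex:
    """Hypothetical in-memory index exposing add() and search() only."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def add(self, doc_id: str, text: str) -> None:
        # Storage: the caller passes raw text; the vector is generated here.
        self._vectors[doc_id] = embed(text)

    def search(self, query: str, limit: int = 3) -> list[tuple[str, float]]:
        # Retrieval: cosine similarity (vectors are already unit length).
        q = embed(query)
        scored = [
            (doc_id, sum(a * b for a, b in zip(q, vec)))
            for doc_id, vec in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:limit]


if __name__ == "__main__":
    index = ToyVectorIndex()
    index.add("shoes", "red running shoes")
    index.add("jacket", "blue denim jacket")
    index.add("headphones", "wireless noise cancelling headphones")
    print(index.search("running shoes", limit=2))
```

The point of the sketch is the shape of the interface: the caller only ever handles text, and the embedding step never has to be wired in separately.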
“Vector search is difficult to implement — vector databases are only one part of the puzzle, and developers find it challenging to bring all of the required components together to build a vector based search experience with optimal relevance, latency and reliability,” Marqo co-founder and CEO Tom Hamer explained in an email to TechCrunch. “Marqo provides an end-to-end system that brings all of these components together solving a major pain point for developers.”
Moreover, search systems are only as good as the results they generate, which means that relevance, accuracy and “up-to-dateness” are integral to any information storage and retrieval system. And this is something Hamer said Marqo offers off the bat.
“If developers want to continually improve relevance of search results, they have to manually train new AI models for vector generation,” he continued. “Marqo’s continuous learning technology will allow search to automatically improve based on user engagement — such as clicks, ‘add to cart’ and so on — this is particularly important for ecommerce and other end-user search use cases.”
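Marqo has not detailed how its continuous learning works, so the following is only a rough sketch of the general idea of letting engagement nudge rankings. It assumes a simple blend of a similarity score with a smoothed click rate; the `EngagementReranker` class, its `weight` parameter and the scoring formula are all hypothetical, and this approach adjusts rankings after retrieval rather than retraining the model that generates vectors.

```python
class EngagementReranker:
    """Hypothetical re-ranker that blends similarity with observed click rates."""

    def __init__(self, weight: float = 0.2) -> None:
        self.weight = weight  # share of the final score driven by engagement
        self._impressions: dict[str, int] = {}
        self._clicks: dict[str, int] = {}

    def record(self, doc_id: str, clicked: bool) -> None:
        """Log that a result was shown, and whether the user engaged with it."""
        self._impressions[doc_id] = self._impressions.get(doc_id, 0) + 1
        if clicked:
            self._clicks[doc_id] = self._clicks.get(doc_id, 0) + 1

    def click_rate(self, doc_id: str) -> float:
        """Smoothed click rate; an unseen document starts at a neutral 0.5."""
        clicks = self._clicks.get(doc_id, 0)
        impressions = self._impressions.get(doc_id, 0)
        return (clicks + 1) / (impressions + 2)

    def rerank(self, results: list[tuple[str, float]]) -> list[tuple[str, float]]:
        """Re-order (doc_id, similarity) pairs by the blended score."""
        blended = [
            (doc_id, (1 - self.weight) * score + self.weight * self.click_rate(doc_id))
            for doc_id, score in results
        ]
        return sorted(blended, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    reranker = EngagementReranker(weight=0.2)
    results = [("a", 0.80), ("b", 0.78)]
    for _ in range(10):
        reranker.record("a", clicked=False)
        reranker.record("b", clicked=True)
    print(reranker.rerank(results))
```

In this toy version, a result that users consistently click can overtake a slightly more similar result that they ignore, with no manual retraining involved.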
Marqo raised £660,000 ($840,000) in pre-seed funding last year, and today it announced a fresh $4.4 million in seed funding as it looks to double down on its commercial efforts. This includes a new cloud service that’s formally launching to the public today to complement the existing open source Marqo project.
The open source factor
Like many of its competitors, Marqo adopted an open source ethos as a very deliberate move to ingratiate itself with the developer community, whose members are able to tinker with and tailor the product to figure out whether it’s for them. In turn, this means they may be more likely to recommend the product to the powers-that-be at their company, and even contribute to product development.
“I strongly believe that the development of open source products leads to a higher quality outcome,” Hamer said. “Building Marqo on an open source foundation allowed us to have a tight feedback loop with our users and iterate extremely fast to build the product that developers actually need. Open source is also a great customer acquisition channel. Customers can see exactly what they are buying, they can try it for free and make sure that Marqo is right for their use case.”
That said, running an open source product at production grade typically requires a lot of resources, both in terms of human input and infrastructure. And that is where Marqo Cloud enters the fray.
“Self-hosting the open source product is a great option for users that don’t require real-time search and have a small number of end users, or for building a proof of concept,” Hamer continued. “Marqo’s Cloud platform handles the infrastructure, maintenance, and operations of the cloud resources for our customers, ensuring optimal performance and cost efficiency.”
Marqo’s seed round was led by Australian VC Blackbird Ventures, with participation from Creator Fund, both of which invested in the pre-seed round. For the latest round, Marqo also attracted investments from January Capital and Cohere co-founders Ivan Zhang and Aidan Gomez.