Hey there, tech enthusiasts! Welcome to our cozy corner of the digital universe. Today, we’re exploring vector databases—those unsung heroes that quietly power our favorite AI applications. So, grab your virtual coffee, and let’s dive in!
Vector databases form the backbone of machine learning and artificial intelligence applications. Unlike traditional databases that deal with structured data, these databases store and manage large amounts of high-dimensional data in a vector embedding format, enabling efficient storage, retrieval, and processing of complex information.
But what’s a vector, you ask? In the context of a vector database, a vector is an embedding-based representation of an object such as an image, an audio clip, or a piece of text, commonly used in machine learning tasks. These representations are high-dimensional numerical arrays that capture the essential features or characteristics of the objects they describe. Take an image of a cat, for example: its vector representation might capture features such as the shape of the ears, the color of the fur and eyes, the pattern on the coat, and the size of the whiskers. These embeddings are typically generated by deep neural network models, such as convolutional neural networks (CNNs) for images, or word2vec and BERT for text.
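To make this concrete, here's a toy sketch of the idea. In practice the numbers come from a trained model (a CNN, word2vec, BERT, etc.); the tiny hand-made vectors below are only stand-ins to show that each dimension encodes some feature, and that similar objects end up close together:

```python
import numpy as np

# Hypothetical 4-dimensional "embeddings" -- a real model would produce
# hundreds or thousands of dimensions, learned rather than hand-picked.
#                        ear shape, fur darkness, eye color, whisker size
cat_photo_1 = np.array([0.91, 0.12, 0.45, 0.80])
cat_photo_2 = np.array([0.88, 0.15, 0.40, 0.77])  # a similar-looking cat
dog_photo   = np.array([0.10, 0.95, 0.60, 0.20])  # a very different animal

# Similar objects sit close together in the embedding space:
print(np.linalg.norm(cat_photo_1 - cat_photo_2))  # small distance
print(np.linalg.norm(cat_photo_1 - dog_photo))    # much larger distance
```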
So far, we've learned that vector databases store high-dimensional embeddings of objects. But how are they useful? Given a query vector—say, the embedding of an image or an audio clip—a vector database can quickly retrieve the stored embeddings most similar to it. Vector DBs typically use approximate k-nearest-neighbor (ANN) algorithms, built on similarity measures such as cosine similarity or Euclidean distance, to find these matches.
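Here is a minimal sketch of the core operation: a brute-force nearest-neighbor search using cosine similarity. Real vector databases replace this linear scan with approximate indexes (such as HNSW graphs) to stay fast at billions of vectors; the toy database and query below are made up for illustration:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def knn_search(query, vectors, k=2):
    """Return indices of the k stored vectors most similar to the query.
    A real vector DB uses an approximate index instead of this O(n) scan."""
    scores = [cosine_similarity(query, v) for v in vectors]
    return np.argsort(scores)[::-1][:k]

# A toy "database" of 4-dimensional embeddings.
database = np.array([
    [0.9, 0.1, 0.0, 0.4],
    [0.1, 0.9, 0.8, 0.0],
    [0.8, 0.2, 0.1, 0.5],
    [0.0, 1.0, 0.7, 0.1],
])

query = np.array([0.85, 0.15, 0.05, 0.45])
print(knn_search(query, database, k=2))  # indices of the two closest embeddings
```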
Industrial examples of vector DBs for image search include Amazon's use of the OpenSearch service [1]. Amazon uses the OpenSearch Vector Search Collection as a vector database for image search, letting users query the search engine with rich media such as images. The implementation is similar to semantic search: deep learning models, such as ResNets, convert images into vector embeddings. OpenSearch offers specialized indexes for efficient vector similarity search and a scalable engine that handles up to billions of vectors at low latency.
Another example is Spotify's Voyager, released in December 2023. Voyager is an open-source approximate nearest-neighbor search library that enables similarity search over in-memory collections of vectors, succeeding Annoy as Spotify's recommended nearest-neighbor library for production use. It helps Spotify recommend new songs to users based on their listening preferences and also helps identify and eliminate duplicate tracks.
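Duplicate detection reuses the same similarity machinery: if two track embeddings lie within a small distance of each other, they are probably the same recording. The sketch below is a toy illustration only—it does not use Voyager's actual API, and the threshold value is an arbitrary assumption:

```python
import numpy as np
from itertools import combinations

def find_duplicates(embeddings, threshold=0.1):
    """Flag pairs of embeddings whose Euclidean distance falls below a
    threshold, i.e., tracks that are likely the same recording.
    A toy O(n^2) scan; a library like Voyager would use an ANN index."""
    dupes = []
    for i, j in combinations(range(len(embeddings)), 2):
        if np.linalg.norm(embeddings[i] - embeddings[j]) < threshold:
            dupes.append((i, j))
    return dupes

tracks = np.array([
    [0.50, 0.20, 0.90],  # track 0
    [0.51, 0.19, 0.89],  # track 1: near-identical to track 0 (a duplicate)
    [0.10, 0.80, 0.30],  # track 2: a different song
])
print(find_duplicates(tracks))  # → [(0, 1)]
```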
The growth of chatbots has seen a major boost following the advancement of generative LLMs like OpenAI's GPT-3 and Meta's Llama. Generative AI has enabled chatbots to engage in more natural and contextually relevant conversations, providing a personalized experience for users and, in some cases, resolving problems more efficiently than a human agent.
However, one of the key challenges in building chatbots is ensuring that they provide accurate and relevant responses to user queries. This is where retrieval-augmented generation (RAG) comes into play. RAG is a method used to enhance the reliability of generative AI chatbots: it combines the power of a generative model with an external knowledge base to improve the quality and relevance of the responses the chatbot produces. By grounding answers in that external knowledge base, RAG addresses the problem of "hallucinations" in generative LLMs—cases where the model produces a plausible but incorrect answer. This can occur when, for example, you ask ChatGPT to create an itinerary for your Barcelona trip, and it tells you to visit imaginary museums that do not even exist!
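At its core, a RAG pipeline is "retrieve, then generate." Below is a minimal sketch under stated assumptions: `embed` is a toy stand-in for a real embedding model, the knowledge base is made up, and the final LLM call is stubbed out, since the point here is the retrieval step that grounds the prompt:

```python
import numpy as np

def embed(text, dim=16):
    """Toy stand-in for a real embedding model (e.g., BERT): hashes each
    word into one of `dim` buckets. Real embeddings are learned, dense,
    and far higher-dimensional."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

knowledge_base = [
    "The Picasso Museum in Barcelona holds over 4,000 works.",
    "Vector databases store embeddings for similarity search.",
    "RAG grounds a language model's answers in retrieved documents.",
]
kb_vectors = [embed(doc) for doc in knowledge_base]

def retrieve(query, k=1):
    """Return the k documents whose embeddings are most similar to the query."""
    scores = [np.dot(embed(query), v) for v in kb_vectors]
    return [knowledge_base[i] for i in np.argsort(scores)[::-1][:k]]

def rag_answer(question):
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    # A real system would now send `prompt` to an LLM; we just return it.
    return prompt

print(rag_answer("What does RAG do?"))
```

The key design point is that the generator only ever sees facts pulled from the knowledge base, which is what curbs hallucinated answers.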
One really cool application of this is Stack Overflow's intuitive search experience: they modeled Stack Overflow questions and answers as embeddings using a pre-trained BERT model, and used Weaviate, an open-source vector DB, to store and retrieve those embeddings and to compute similarity between user search queries and Stack Overflow results.
Ever wondered how e-commerce websites can recommend products so precisely personalized to your taste that you end up ordering items you didn't even know you wanted? E-commerce websites use product embeddings to personalize product recommendations. These embeddings are created based on the characteristics and relationships of the products and the order history of millions of other users.
In the embedding space, items that are frequently purchased together or share similar features are placed closer to each other, indicating a higher similarity between them. The types of data used for creating these embeddings can include purchase activity or co-rating similarity, where products rated similarly by users are considered alike.
Once the embedding model is trained, it can be used to generate personalized recommendations. When a user interacts with the system, their behavior and preferences are used to generate the user's embedding. Amazon uses the OpenSearch vector DB [3] to store all product embeddings and to find similarities between the user's embedding and those of products in the database. Products whose embeddings are closest to the user's embedding are then recommended.
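A minimal sketch of this idea, under the simplifying assumption that a user's embedding is just the average of the embeddings of products they have bought (real systems learn this mapping from behavior data, and the product vectors below are made up):

```python
import numpy as np

# Toy product embeddings (in practice, learned from co-purchase data).
products = {
    "running shoes": np.array([0.9, 0.1, 0.2]),
    "sports socks":  np.array([0.8, 0.2, 0.1]),
    "novel":         np.array([0.1, 0.9, 0.3]),
    "bookmark":      np.array([0.2, 0.8, 0.2]),
}

def user_embedding(purchased):
    """Simplifying assumption: the user is the mean of what they bought."""
    return np.mean([products[p] for p in purchased], axis=0)

def recommend(purchased, k=1):
    """Recommend the k un-purchased products closest to the user's embedding."""
    u = user_embedding(purchased)
    candidates = [p for p in products if p not in purchased]
    candidates.sort(key=lambda p: np.linalg.norm(products[p] - u))
    return candidates[:k]

print(recommend(["running shoes"]))  # → ['sports socks']
```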
So far, we've learned what vector databases are and how they're used. Now let's explore some examples of vector databases:
Several traditional databases have added support for handling and querying high-dimensional vector data, including:
In this era of AI-enabled applications, vector databases form the backbone of numerous applications that we interact with daily: the music recommendations in our playlists, chatbots acting as AI assistants that can solve problems once handled only by human agents, and much more. The synergy between vector databases and deep learning models sets the stage for a future where AI truly understands the nuances of human language, visual input, and sound. Imagine:
The potential is immense, and the possibilities are truly limitless. As vector databases and deep learning models continue to become more robust and refined, the lines between what machines can understand and how humans interact with information will continue to blur. We're one step closer to a future where AI-powered applications augment our abilities, leading to a better understanding of our world.
Disclaimer: The author is completely responsible for the content of this article. The opinions expressed are their own and do not represent IEEE's position nor that of the Computer Society nor its Leadership.