Vector Database: What is it and why you should know it?
Ejiro Onose · 9 min read · Dec 22, 2023
“If 2021 was the year of graph databases, 2023 is the year of vector databases” — Chip Huyen.
Why, you ask?
Well, it’s because Generative AI and Large Language Models (LLMs) are now popular, and one of the best ways to handle LLM data is with a vector database: vector databases provide the ideal infrastructure for managing the complex, high-dimensional data that LLMs produce and thrive on.
In this article, I’ll discuss what vector databases are, how they work, and some excellent vector database tools you should check out.
Before we dive into vector databases, let’s first understand what a vector is.
What is a Vector?
In machine learning (ML), a vector is a collection of numerical values that represent the characteristics or features of multi-dimensional objects such as words, images, etc.
For example, a vector representing an image might contain values corresponding to the pixel intensities across the image’s color channels.
What are Embeddings?
An embedding is a technique for representing complex data, such as images, text, or audio, as numerical vectors.
These embeddings capture the essence of the data and make the semantic similarity (or relationship) between different objects explicit: similar objects have vectors that are close to each other in the vector space, which lets ML algorithms process and compare them efficiently.
ML models often generate embeddings as part of their training process. For LLM applications, a dedicated embedding model is used to create the embeddings.
Embeddings are vectors that represent the essential features of a data point. For example, a natural language processing model might generate embeddings for words or sentences.
Embeddings can be used for a variety of tasks, such as clustering, classification, and anomaly detection. Vector databases can be used to store and query embeddings efficiently, which makes them ideal for ML applications.
To see what embeddings look like, check out this Vectorizer created by Kenny, which converts text to embeddings.
Note: An embedding is a vector representation, but not all vectors are embeddings.
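To make this concrete, here is a minimal sketch of turning text into embedding vectors with an off-the-shelf model. The sentence-transformers package and the all-MiniLM-L6-v2 model are assumptions chosen for illustration; any embedding model (including a hosted embedding API) works the same way.
```python
# A minimal sketch: turning text into embedding vectors.
# Assumes the `sentence-transformers` package and the
# "all-MiniLM-L6-v2" model; any embedding model works similarly.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Vector databases store embeddings.",
    "Embeddings capture semantic similarity.",
    "I had pizza for lunch.",
]

# Each sentence becomes a 384-dimensional vector (for this model).
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)
```
Semantically related sentences (the first two) end up closer together in that 384-dimensional space than the unrelated third one.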
Putting it all together, let us define a vector database.
What is a Vector Database?
A vector database is a type of database that stores and manages unstructured data, such as text, images, or audio, as high-dimensional vectors, making it easy to find and retrieve similar objects quickly and at scale in production.
They work by using algorithms like vector similarity search to index and query vector embeddings.
The importance of vector databases in LLM projects lies in their ability to provide fast search, high performance, scalability, and reliable data retrieval by comparing vectors and finding the most similar ones.
Vector databases’ search capabilities can be used in applications ranging from classical ML use cases, such as recommender systems, to providing long-term memory for large language models, text understanding, video summarization, drug discovery, stock market analysis, and much more.
As data continues to grow in complexity and volume, the scalability, speed, and accuracy offered by vector databases position them as a critical tool for extracting meaningful insights and unlocking new opportunities across various domains.
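Under the hood, the core operation is nearest-neighbor search over vectors. Here is a brute-force sketch using cosine similarity with NumPy; real vector databases replace this linear scan with approximate nearest neighbor (ANN) indexes such as HNSW to stay fast at scale. The sizes and random data below are purely illustrative.
```python
import numpy as np

# Toy "index": 1,000 stored embeddings of dimension 384 (illustrative only).
rng = np.random.default_rng(0)
index = rng.random((1000, 384), dtype=np.float32)
query = rng.random(384, dtype=np.float32)

# Cosine similarity between the query and every stored vector.
scores = (index @ query) / (np.linalg.norm(index, axis=1) * np.linalg.norm(query))

# Top-5 most similar stored vectors.
top_k = np.argsort(scores)[::-1][:5]
print(top_k, scores[top_k])
```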
What are the benefits of Vector Databases?
Here are some specific reasons why vector databases are so well-suited for LLMs and generative AI:
Handling Massive Data Loads
Vector databases can handle the massive amounts of data that are generated by LLMs and generative AI. Traditional databases might struggle with the millions or even billions of data points produced in a single run, but vector databases are purpose-built to handle such large datasets with efficiency.
Efficient Similarity Searches
Vector databases can find data that is similar to a given query vector. This is essential for tasks such as image search and content recommendation, which are often used in conjunction with LLMs and generative AI. For example, if you are using an LLM to generate a new image, you can use a vector database to find other images that are similar to the generated image.
Integration with ML Algorithms
Vector databases can be integrated with machine learning algorithms. This makes it easy to use vector databases to train and evaluate machine learning models. For example, you can use a vector database to store the data that is used to train a model, and then use the vector database to search for the data that is most relevant to the model.
Handling Vector Embeddings
Vector databases provide a superior solution for handling vector embeddings by addressing the limitations of standalone vector indices, such as scalability challenges, cumbersome integration processes, and the absence of real-time updates and built-in security measures.
List of Some Top Vector Databases
There are several vector database solutions available in the market, each with its own set of features and capabilities. Some of the top vector database solutions include:
Weaviate
Pinecone
Chroma DB
Qdrant
Milvus
Below is an overview of some of the features of these vector databases. You can also check out this comprehensive vector database feature matrix by Dhruv Anand.
Weaviate
Weaviate is an open-source vector database that can be used to store, search, and manage vectors of any dimensionality. It is designed to be scalable and easy to use, and it can be deployed on-premises or in the cloud.
Features:
Weaviate can store and search vectors from various data modalities, including images, text, and audio.
Weaviate provides seamless integration with machine learning frameworks and tools such as Hugging Face, OpenAI, LangChain, LlamaIndex, TensorFlow, PyTorch, and Scikit-learn.
Weaviate can index vectors in real-time, making it ideal for applications that require low-latency search.
Weaviate can be scaled to handle large volumes of data and high query throughput.
Weaviate can be used in memory for fast search or with disk-based storage for larger datasets.
Weaviate provides a user-friendly interface for managing vectors and performing searches.
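To give a feel for the API, here is a minimal sketch using the Weaviate Python client (v3-style), assuming a local Weaviate instance at http://localhost:8080 and a hypothetical Document class; adapt the class name and vectors to your own schema.
```python
import weaviate

# Assumes a local Weaviate instance and the v3-style Python client.
client = weaviate.Client("http://localhost:8080")

# Store an object together with a (hypothetical) precomputed vector.
client.data_object.create(
    data_object={"text": "Vector databases store embeddings."},
    class_name="Document",
    vector=[0.1, 0.2, 0.3, 0.4],
)

# Retrieve the objects whose vectors are closest to a query vector.
result = (
    client.query.get("Document", ["text"])
    .with_near_vector({"vector": [0.1, 0.2, 0.3, 0.4]})
    .with_limit(3)
    .do()
)
print(result)
```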
Pinecone
Pinecone is a fully managed cloud-based vector database that is designed to make it easy for businesses and organizations to build and deploy large-scale ML applications.
Some Pinecone Features:
Pinecone is designed to be fast and scalable, allowing for efficient retrieval of similar data points based on their vector representations.
It can handle large-scale ML applications with millions or billions of data points.
Pinecone handles infrastructure management and maintenance for its users, so there is no infrastructure to set up or operate.
Pinecone can handle high query throughput and low latency search.
Pinecone is a secure platform that meets the security needs of businesses and organizations.
Pinecone is designed to be user-friendly and accessible via its simple API for storing and retrieving vector data, making it easy to integrate into existing ML workflows.
Pinecone supports real-time updates, allowing for efficient updates to the vector database as new data points are added. This ensures that the vector database remains up-to-date and accurate over time.
Pinecone can be synced with data from various sources using tools like Airbyte and monitored using Datadog.
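As a sketch of what working with Pinecone looks like in Python, using the classic pinecone-client API (newer client versions use a slightly different initialization), with a hypothetical API key, environment, and index name:
```python
import pinecone

# Hypothetical credentials and index name; newer clients use
# `from pinecone import Pinecone` with a different setup.
pinecone.init(api_key="YOUR_API_KEY", environment="us-west1-gcp")

index = pinecone.Index("my-llm-index")

# Upsert a few (id, vector, metadata) tuples.
index.upsert(vectors=[
    ("doc-1", [0.1, 0.2, 0.3, 0.4], {"source": "blog"}),
    ("doc-2", [0.2, 0.1, 0.4, 0.3], {"source": "docs"}),
])

# Query for the nearest neighbors of a vector.
result = index.query(vector=[0.1, 0.2, 0.3, 0.4], top_k=2, include_metadata=True)
print(result)
```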
Chroma DB
Chroma DB is an open-source vector store for storing and retrieving vector embeddings. It is mainly used to save embeddings along with metadata to be used later by LLMs and can also be used for semantic search engines over text data.
Chroma DB offers a self-hosted server option and supports different underlying storage options like DuckDB for standalone or ClickHouse for scalability.
Chroma DB offers two memory modes:
The in-memory mode
The persistent mode
The in-memory mode is used for rapid testing, proof-of-concept (POC) work, and quick querying; data lives only for the current session.
The persistent mode lets users save and load data to and from disk, so the database persists beyond the current session and collections can be reused between runs. It also allows documents to be added and deleted after a collection is created, and it is essential for production use cases where an in-memory database is not sufficient.
Some of the key features of Chroma DB are:
It provides SDKs for Python and JavaScript/TypeScript and focuses on simplicity, speed, and enabling analysis.
Chroma can store vectors from various data types, including text, images, and audio.
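Here is a minimal sketch with the chromadb Python package, using the persistent mode described above; the path and collection name are arbitrary. With no embedding function specified, Chroma falls back to a default sentence-transformer model to embed the documents.
```python
import chromadb

# Persistent mode: data is saved under the given (arbitrary) path.
# Use chromadb.Client() instead for the in-memory mode.
client = chromadb.PersistentClient(path="./chroma_data")

collection = client.get_or_create_collection("docs")

# Chroma embeds the documents with its default embedding function.
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "Vector databases store embeddings.",
        "Chroma DB is an open-source vector store.",
    ],
    metadatas=[{"topic": "vectors"}, {"topic": "chroma"}],
)

results = collection.query(query_texts=["what stores embeddings?"], n_results=1)
print(results["documents"])
```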
Qdrant
Qdrant is an open-source vector database and vector search engine that provides fast and scalable vector similarity search services with additional payloads.
Here are some of the features of Qdrant:
Qdrant offers support for disk-stored collections, as storage space is cheaper than memory. It recently introduced a Scalar Quantization mechanism, which can reduce memory requirements by up to four times.
Qdrant allows users to express complex filtering conditions, including over nested payload structures.
It provides an asynchronous I/O interface that reduces overhead by managing I/O operations asynchronously, thus minimizing context switches.
It uses distance metrics to measure similarity between vectors; the metric must be selected when the collection is created.
It can be used with the Python qdrant client, which provides a convenient API to store, search, and manage points (i.e., vectors) with an additional payload.
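A minimal sketch with the qdrant-client Python package, run entirely in memory for quick experiments (the collection name and four-dimensional vectors are illustrative):
```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

# ":memory:" runs an in-process instance, handy for quick experiments.
client = QdrantClient(":memory:")

# The distance metric is fixed at collection-creation time.
client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=4, distance=Distance.COSINE),
)

# Each point is a vector plus an arbitrary JSON payload.
client.upsert(
    collection_name="docs",
    points=[
        PointStruct(id=1, vector=[0.1, 0.2, 0.3, 0.4], payload={"text": "hello"}),
        PointStruct(id=2, vector=[0.4, 0.3, 0.2, 0.1], payload={"text": "world"}),
    ],
)

hits = client.search(collection_name="docs", query_vector=[0.1, 0.2, 0.3, 0.4], limit=1)
print(hits)
```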
Milvus
Milvus is an open-source vector database that is designed for similarity searches in dense vector datasets containing millions or even billions of vectors. Milvus vector database adopts a systemic approach to cloud-nativity by separating compute from storage and allowing you to scale both up and out.
Here are some of the features of Milvus:
Milvus uses a distributed architecture that separates storage and computing, allowing for horizontal scalability in computing nodes.
Milvus can be scaled to handle trillions of vectors and millions of queries per second.
Milvus supports various data types and provides enhanced vector similarity search with attribute filtering, UDF support, configurable consistency levels, time travel, and more.
Milvus can handle high query throughput and low latency searches.
To help users try Milvus more quickly, Bin Ji, a top contributor to the Milvus community, developed Milvus Lite, a lightweight version of Milvus that can get you started in minutes while still offering many of the same benefits.
Milvus provides a user-friendly interface for managing vectors and performing searches.
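A minimal sketch using pymilvus with the lightweight MilvusClient interface (backed by Milvus Lite through a local file); the file name, collection name, and dimension are illustrative, and older pymilvus versions use a different, ORM-style API.
```python
from pymilvus import MilvusClient

# A local file-backed instance via Milvus Lite (file name is arbitrary).
client = MilvusClient("milvus_demo.db")

# A simple collection with 4-dimensional vectors (illustrative).
client.create_collection(collection_name="docs", dimension=4)

client.insert(
    collection_name="docs",
    data=[
        {"id": 1, "vector": [0.1, 0.2, 0.3, 0.4], "text": "hello"},
        {"id": 2, "vector": [0.4, 0.3, 0.2, 0.1], "text": "world"},
    ],
)

res = client.search(
    collection_name="docs",
    data=[[0.1, 0.2, 0.3, 0.4]],
    limit=1,
    output_fields=["text"],
)
print(res)
```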
How To Choose The Right Vector Database For Your LLM Projects
To choose the right vector database for LLM projects, there are some factors you should consider. They include:
Scalability: Since LLMs generate and consume vast amounts of vector data, it is best to choose a database that can efficiently store and manage large-scale datasets without compromising performance. Also, the vector database must be able to seamlessly handle future data additions and expansion of your LLM project’s scope.
Performance: It should deliver fast query execution and swift retrieval of relevant vectors. It should also efficiently handle multi-dimensional queries and complex similarity searches.
Security: The database should provide robust security features, including encryption, access controls, and authentication mechanisms. For use cases with personal or sensitive data, the vector database should align with applicable privacy regulations.
Cost: Using LLM APIs already costs a fortune when running at scale, so look for a vector database with a flexible pricing model that fits your use case.
Query interfaces: Evaluate the ease of interaction with the database, including available query languages, APIs, and user interfaces.
Deployment options: Make sure the vector database, whether a cloud-based, on-premises, or hybrid solution, matches your infrastructure preferences and data sensitivity.
Integration capabilities: Ensure seamless integration with your existing LLM infrastructure and other tools in your workflow.
Yep! That’s it.
Vector databases are generally useful when building LLM applications because they serve as an abstraction layer for handling and managing LLM data.
Let me know what you think about the vector database tools mentioned here. I hear there are some other really interesting vector databases out there; if you have used any, please comment and share what your experience was like.
Happy Building!