diff --git a/docs/ai-integration/vector-search/assets/json-document-nodejs.png b/docs/ai-integration/vector-search/assets/json-document-nodejs.png
new file mode 100644
index 0000000000..3ae5cf3146
Binary files /dev/null and b/docs/ai-integration/vector-search/assets/json-document-nodejs.png differ
diff --git a/docs/ai-integration/vector-search/content/_data-types-for-vector-search-csharp.mdx b/docs/ai-integration/vector-search/content/_data-types-for-vector-search-csharp.mdx
index 6334e4665e..dedd319a35 100644
--- a/docs/ai-integration/vector-search/content/_data-types-for-vector-search-csharp.mdx
+++ b/docs/ai-integration/vector-search/content/_data-types-for-vector-search-csharp.mdx
@@ -2,6 +2,8 @@ import Admonition from '@theme/Admonition';
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 import CodeBlock from '@theme/CodeBlock';
+import ContentFrame from '@site/src/components/ContentFrame';
+import Panel from '@site/src/components/Panel';
@@ -19,7 +21,7 @@ import CodeBlock from '@theme/CodeBlock';
-## Supported data types for vector search
+
 ### Textual data
@@ -83,7 +85,9 @@ you can also use lists (for example, `List` or `List`) for dynam
-## RavenVector
+
+
+
 RavenVector is RavenDB's dedicated data type for storing and querying **numerical embeddings**.
 It is highly optimized to minimize storage space and improve the speed of reading arrays from disk,
@@ -126,3 +130,5 @@ For example:
 ![json document](../assets/json-document.png)
+
+
\ No newline at end of file
diff --git a/docs/ai-integration/vector-search/content/_data-types-for-vector-search-nodejs.mdx b/docs/ai-integration/vector-search/content/_data-types-for-vector-search-nodejs.mdx
new file mode 100644
index 0000000000..af637aa673
--- /dev/null
+++ b/docs/ai-integration/vector-search/content/_data-types-for-vector-search-nodejs.mdx
@@ -0,0 +1,156 @@
+import Admonition from '@theme/Admonition';
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+import CodeBlock from '@theme/CodeBlock';
+import ContentFrame from '@site/src/components/ContentFrame';
+import Panel from '@site/src/components/Panel';
+
+
+
+* Data for vector search can be stored in **raw** or **pre-quantized** formats using several data types,
+  as outlined below.
+
+* Text and numerical data that is not pre-quantized can be further quantized in the generated embeddings.
+  Learn more in [Quantization options](../../../ai-integration/vector-search/vector-search-using-dynamic-query.mdx#quantization-options).
+
+* In this article:
+  * [Supported data types for vector search](../../../ai-integration/vector-search/data-types-for-vector-search.mdx#supported-data-types-for-vector-search)
+    * [Textual data](../../../ai-integration/vector-search/data-types-for-vector-search.mdx#textual-data)
+    * [Numerical data](../../../ai-integration/vector-search/data-types-for-vector-search.mdx#numerical-data)
+  * [RavenVector](../../../ai-integration/vector-search/data-types-for-vector-search.mdx#ravenvector)
+
+
+
+
+
+### Textual data
+
+
+
+`string` - A single text entry.
+`string[]` - An array of text entries.
+
+
+
+### Numerical data
+
+* You can store **pre-generated** embedding vectors in your documents,
+  typically created by machine-learning models from text, images, or other sources.
+
+* When storing numerical embeddings in a document field:
+  * Ensure that all vectors within this field across all documents in the collection are generated by the **same model** and model version and have the **same dimensions**.
+  * Consistency in both dimensionality and model source is crucial for meaningful comparisons in the vector space.
+
+* In addition to the native types described below, we highly recommend using [RavenVector](../../../ai-integration/vector-search/data-types-for-vector-search.mdx#ravenvector)
+  for efficient storage and fast queries when working with numerical embeddings.
+
+
+
+**Raw embedding data**:
+Use when precision is critical.
+
+`number[]` - A single vector of numerical values representing raw embedding data.
+`number[][]` - An array of vectors, where each entry is a separate embedding vector.
+
+
+
+
+
+**Pre-quantized data**:
+Use when you prioritize storage efficiency and query speed.
+
+`number[]` - A single pre-quantized embedding vector in the _Int8_ or _Binary_ quantization format.
+`number[][]` - An array of pre-quantized embedding vectors.
+
+When storing data in these formats in your documents, you should use [RavenDB’s vector quantizer methods](../../../ai-integration/vector-search/vector-search-using-dynamic-query.mdx#section-1).
+
+
+
+
+
+**Base64-encoded data**:
+Use when embedding data needs to be represented as a compact and easily serializable string format.
+
+`string` - A single vector encoded as a Base64 string.
+`string[]` - An array of Base64-encoded vectors.
+
+
+
+
+
+
+
+* `RavenVector` is a helper function that wraps a numerical array into a dedicated `@vector` object in RavenDB.
+  This structure is highly efficient - it minimizes disk space usage and improves read performance,
+  making it ideal for both storing and querying embeddings.
+
+* `RavenVector` is purely structural - it does not apply any transformation or validation to the vector;
+  it simply ensures that the vector is stored under the `@vector` field in the JSON document.
+  All vector processing and comparisons are handled entirely by the server.
+
+---
+
+#### Example: storing a vector using RavenVector
+
+
+
+```js
+const session = documentStore.openSession();
+
+const user = new User();
+
+// Store embedding in a 'RavenVector' format
+user.embeddingRavenVector = RavenVector([
+    6.599999904632568,
+    7.699999809265137
+]);
+
+// Store embedding in a raw numerical array
+// This won’t take advantage of RavenDB’s optimized storage format
+user.embeddingVector = [
+    6.599999904632568,
+    7.699999809265137
+];
+
+await session.store(user, "users/1");
+await session.saveChanges();
+```
+
+
+```js
+class User {
+    constructor(embeddingRavenVector, embeddingVector) {
+        this.embeddingRavenVector = embeddingRavenVector;
+        this.embeddingVector = embeddingVector;
+    }
+}
+```
+
+
+
+When stored, the vector's content created with the `RavenVector` function will appear under the `@vector` field in the JSON document:
+
+![json document](../assets/json-document-nodejs.png)
+
+To query the `embeddingRavenVector` field, use:
+
+
+```js
+const similarUsers = await session.query({ collection: "Users" })
+    .vectorSearch(
+        field => field.withEmbedding("embeddingRavenVector"),
+
+        queryVector => queryVector.byEmbedding(
+            // Use 'RavenVector' to wrap the query vector
+            RavenVector([
+                6.599999904632568,
+                7.699999809265137
+            ])
+        )
+    )
+    .all();
+```
+
+
+
+
diff --git a/docs/ai-integration/vector-search/content/_vector-search-using-dynamic-query-nodejs.mdx b/docs/ai-integration/vector-search/content/_vector-search-using-dynamic-query-nodejs.mdx
index f2357f7b32..e8f1df0d56 100644
--- a/docs/ai-integration/vector-search/content/_vector-search-using-dynamic-query-nodejs.mdx
+++ b/docs/ai-integration/vector-search/content/_vector-search-using-dynamic-query-nodejs.mdx
@@ -1247,7 +1247,7 @@ factory.byBase64(base64Embedding);
 #### `RavenVector`:
-RavenVector is RavenDB’s dedicated representation for storing and querying numerical embeddings.
+RavenVector is RavenDB’s dedicated representation for storing and querying numerical embeddings.
 Learn more in [RavenVector](../../../ai-integration/vector-search/data-types-for-vector-search.mdx#ravenvector).
@@ -1255,7 +1255,7 @@ Learn more in [RavenVector](../../../ai-integration/vector-search/data-types-for
 // representation of a RavenVector:
 { "@vector": number[] }
-// Helper to create this wrapper:
+// Helper function to create this wrapper:
 RavenVector(numberArray) // => { "@vector": numberArray }
 ```
diff --git a/docs/ai-integration/vector-search/data-types-for-vector-search.mdx b/docs/ai-integration/vector-search/data-types-for-vector-search.mdx
index c8023243b6..f54d6feba7 100644
--- a/docs/ai-integration/vector-search/data-types-for-vector-search.mdx
+++ b/docs/ai-integration/vector-search/data-types-for-vector-search.mdx
@@ -8,8 +8,9 @@ import LanguageSwitcher from "@site/src/components/LanguageSwitcher";
 import LanguageContent from "@site/src/components/LanguageContent";
 import DataTypesForVectorSearchCsharp from './content/_data-types-for-vector-search-csharp.mdx';
+import DataTypesForVectorSearchNodejs from './content/_data-types-for-vector-search-nodejs.mdx';
-export const supportedLanguages = ["csharp"];
+export const supportedLanguages = ["csharp", "nodejs"];
@@ -17,6 +18,9 @@ export const supportedLanguages = ["csharp"];
+
+
+