# Retrieval

The Retrieval runtime of our LLM Module provides an interface for retrieving relevant context from different vector databases, starting with PGVector and Qdrant. The runtime implements a client for each database, allowing you to query it with an embedding vector. As with the other LLM Module runtimes, it is designed to integrate easily into a [Seldon Core 2 pipeline](https://docs.seldon.io/projects/seldon-core/en/v2/contents/pipelines/index.html) so that you can build complex LLM applications, in this case applications requiring Retrieval Augmented Generation (RAG) functionality.

A typical example of a RAG pipeline is the following:

{% @mermaid/diagram content="flowchart LR
user_input([User Input])
llm_output([LLM Output])
document_store[(Document Store)]
embedding_model["Embedding Model"]
vector_db["Vector Database Client"]
llm["LLM"]

user_input --> embedding_model --> vector_db
vector_db --> llm
llm --> llm_output
vector_db --> document_store

%% Styling for modules
style embedding_model fill:#407,stroke:#333,stroke-width:2px,color:#fff
style vector_db fill:#407,stroke:#333,stroke-width:2px,color:#fff
style llm fill:#407,stroke:#333,stroke-width:2px,color:#fff" %}
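
The flow in the diagram can be sketched in a few lines of Python. This is only an illustration of the retrieval step, not the runtime's implementation: the in-memory `DOCUMENTS` list stands in for a vector database, the three-dimensional embeddings are made up, and the helper names `retrieve` and `build_prompt` are hypothetical.

```python
import math

# Hypothetical in-memory stand-in for a vector database (assumption, not the runtime's API).
DOCUMENTS = [
    {"id": 1, "text": "London is the capital of the UK.", "embedding": [1.0, 0.0, 0.0]},
    {"id": 2, "text": "Paris is the capital of France.", "embedding": [0.8, 0.6, 0.0]},
    {"id": 3, "text": "Photosynthesis happens in plants.", "embedding": [0.0, 0.0, 1.0]},
]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc: cosine_similarity(query_embedding, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, docs):
    """Combine the retrieved documents and the user question into an LLM prompt."""
    context = "\n".join(doc["text"] for doc in docs)
    return f"Context:\n{context}\n\nQuestion: {question}"


# In a real pipeline the query embedding would come from the embedding model.
top_docs = retrieve([1.0, 0.1, 0.0], k=2)
prompt = build_prompt("What is the capital of the UK?", top_docs)
```

In the actual pipeline, the embedding model, the vector database client, and the LLM are separate components connected through Seldon Core 2; the sketch only shows how the data moves between them.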

One feature of vector databases is the ability to apply metadata filtering to restrict the search space for similar documents. For the Qdrant database, you can check the filtering documentation [here](https://qdrant.tech/documentation/concepts/filtering/). For the PGVector database, we implemented our own filtering schema. We defined the following comparison operators: `eq` - equals, `new` - not equal, `gt` - greater than, `lt` - less than, `ge` - greater than or equal, `le` - less than or equal, `like` - like, `in` - in. These operators can be used to create predicates, and formulas can be constructed by connecting predicates with logical operators such as `and` or `or`. Here are some examples of filtering formulas:

* `id < 3 AND id > 0` is equivalent to

```json
{
    "id": {"lt": 3, "gt": 0}
}
```

* `id IN (1, 2, 3)` is equivalent to

```json
{
    "id": {"in": [1, 2, 3]}
}
```

* `id <= 3 AND category = "geography"` is equivalent to

```json
{
    "id": {"le": 3}, 
    "category": {"eq": "geography"}
}
```

Note that you can omit the `and` operator: conditions listed side by side in the same object are implicitly combined with `and`.

* `id = 1` is equivalent to

```json
{"id": 1}
```

* `id IN (1, 2, 3)` is equivalent to

```json
{
    "id": [1, 2, 3]
}
```

* `name IS NULL` is equivalent to

```json
{"name": null}
```

* `id = 1 AND category IN ("geography", "history") AND name IS NULL` is equivalent to

```json
{
    "id": 1,
    "category": ["geography", "history"],
    "name": null
}
```

* `(name = "London" OR category = "geography") AND id IN (1, 2, 3)` is equivalent to

```json
{
    "OR": [
        {"name": "London"}, 
        {"category": "geography"}
    ], 
    "id": [1, 2, 3]
}
```
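
To make the correspondence above concrete, here is a minimal Python sketch that turns a filter object into the SQL-like formula it stands for. It is inferred from the examples on this page and is not the runtime's actual implementation: the function names `to_formula`, `predicate`, and `literal` are hypothetical, the not-equal operator is left out to keep the sketch short, and no escaping is performed, so it should not be used to build real queries.

```python
# Comparison operators from the text that map directly onto a SQL operator.
SQL_OPS = {"eq": "=", "gt": ">", "lt": "<", "ge": ">=", "le": "<=", "like": "LIKE"}


def literal(value):
    """Render a value in the notation used by the examples (no escaping)."""
    if isinstance(value, str):
        return f'"{value}"'
    return str(value)


def predicate(field, op, value):
    """Build a single predicate such as `id <= 3` or `id IN (1, 2, 3)`."""
    if op == "in":
        return f"{field} IN ({', '.join(literal(v) for v in value)})"
    return f"{field} {SQL_OPS[op]} {literal(value)}"


def to_formula(filters):
    """Translate a filter object into a formula; sibling keys are joined with AND."""
    parts = []
    for key, value in filters.items():
        if key.upper() in ("AND", "OR"):
            # Explicit connective: the value is a list of sub-filters.
            inner = f" {key.upper()} ".join(to_formula(sub) for sub in value)
            parts.append(f"({inner})")
        elif isinstance(value, dict):
            # Explicit operators, e.g. {"lt": 3, "gt": 0}.
            parts.extend(predicate(key, op, v) for op, v in value.items())
        elif isinstance(value, list):
            # Shorthand: a bare list means `in`.
            parts.append(predicate(key, "in", value))
        elif value is None:
            # Shorthand: a null value means IS NULL.
            parts.append(f"{key} IS NULL")
        else:
            # Shorthand: a bare value means `eq`.
            parts.append(predicate(key, "eq", value))
    return " AND ".join(parts)


formula = to_formula({
    "OR": [{"name": "London"}, {"category": "geography"}],
    "id": [1, 2, 3],
})
print(formula)
```

Running the sketch on the other filter objects above reproduces the formulas listed next to them, which is a quick way to check how a filter will be interpreted before sending it.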

See the example [here](https://github.com/SeldonIO/llm-runtimes/blob/master/docs-gb/examples/retrieval/README.md) for how to use the VectorDB runtime.

