Retrieval

The Retrieval runtime of our LLM Module provides an interface for retrieving relevant context from vector databases, starting with PGVector and Qdrant. The runtime implements a client for each database, which lets you query it with an embedding vector. Like the other LLM Module runtimes, it is designed to integrate into a Seldon Core 2 pipeline so that you can build complex LLM applications, in this case applications that require Retrieval Augmented Generation (RAG).
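
To make "querying with an embedding vector" concrete, here is a minimal in-memory sketch of the nearest-neighbour lookup that a vector database performs. It is not the runtime's client API: the `cosine_similarity` and `top_k` helpers and the toy documents are hypothetical, and databases such as PGVector and Qdrant would normally rely on an index instead of the linear scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy corpus: hypothetical documents with made-up 3-dimensional embeddings.
documents = [
    {"id": 1, "text": "London is the capital of the UK", "embedding": [0.9, 0.1, 0.0]},
    {"id": 2, "text": "The Thames flows through London", "embedding": [0.8, 0.3, 0.1]},
    {"id": 3, "text": "Photosynthesis occurs in plants", "embedding": [0.0, 0.2, 0.9]},
]

results = top_k([1.0, 0.0, 0.0], documents, k=2)
print([doc["id"] for doc in results])  # [1, 2]
```

In a RAG application, the text of the returned documents would then be passed to the LLM as context.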

A typical example of an RAG pipeline is the following:

One feature of vector databases is metadata filtering, which restricts the search space for similar documents. For the Qdrant database, see the Qdrant filtering documentation. For the PGVector database, we implemented our own filtering schema. It defines the following comparison operators: eq - equals, ne - not equal, gt - greater than, lt - less than, ge - greater than or equal, le - less than or equal, like - like, in - in. These operators are used to create predicates, and predicates can be combined into formulas with the connectives and and or. Here are some examples of filtering formulas:

  • id < 3 AND id > 0 is equivalent to

{
    "id": {"lt": 3, "gt": 0}
}
  • id IN (1, 2, 3) is equivalent to

{
    "id": {"in": [1, 2, 3]}
}
  • id <= 3 AND category = "geography" is equivalent to

{
    "id": {"le": 3}, 
    "category": {"eq": "geography"}
}

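Note that you can omit the and operator: conditions on separate keys, or several operators on the same key, are combined with AND implicitly. Shorthand forms are also accepted, as the following examples show: a bare value stands for eq, a list stands for in, and None stands for IS NULL.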

  • id = 1 is equivalent to

{"id": 1}
  • id IN (1, 2, 3) is equivalent to

{
    "id": [1, 2, 3]
}
  • name IS NULL is equivalent to

{"name": None}
  • id = 1 AND category IN ("geography", "history") AND name IS NULL is equivalent to

{
    "id": 1, 
    "category": ["geography", "history"], 
    "name": None
}
  • (name = "London" OR category = "geography") AND id IN (1, 2, 3) is equivalent to

{
    "OR": [
        {"name": "London"}, 
        {"category": "geography"}
    ], 
    "id": [1, 2, 3]
}
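
The correspondence between filter dictionaries and SQL conditions shown above can be sketched in a few lines of Python. This is an illustration only, not the runtime's actual implementation: `compile_filter`, `_predicate` and `SQL_OPERATORS` are hypothetical names, the `%s` placeholder style is an assumption, and the rules are inferred from the examples on this page.

```python
# Hypothetical helper, not the runtime's implementation: maps the operator
# names listed above to SQL comparison symbols.
SQL_OPERATORS = {
    "eq": "=", "ne": "!=", "gt": ">", "lt": "<",
    "ge": ">=", "le": "<=", "like": "LIKE",
}


def _predicate(column, op, operand):
    """Build one SQL predicate and its parameter list for a column."""
    if not column.isidentifier():
        raise ValueError(f"invalid column name: {column!r}")
    if operand is None and op == "eq":
        return f"{column} IS NULL", []
    if op == "in":
        if not operand:
            raise ValueError("'in' requires a non-empty list")
        placeholders = ", ".join(["%s"] * len(operand))
        return f"{column} IN ({placeholders})", list(operand)
    if op not in SQL_OPERATORS:
        raise ValueError(f"unknown operator: {op!r}")
    return f"{column} {SQL_OPERATORS[op]} %s", [operand]


def compile_filter(filters):
    """Translate a filter dict into a (where_clause, params) pair."""
    clauses, params = [], []
    for key, value in filters.items():
        if key.upper() in ("AND", "OR"):
            # Connective: the value is a list of sub-filters.
            compiled = [compile_filter(sub) for sub in value]
            joined = f" {key.upper()} ".join(clause for clause, _ in compiled)
            clauses.append(f"({joined})")
            for _, sub_params in compiled:
                params.extend(sub_params)
            continue
        if isinstance(value, dict):
            # Explicit operators, e.g. {"lt": 3, "gt": 0}.
            predicates = [_predicate(key, op, arg) for op, arg in value.items()]
        elif isinstance(value, list):
            # Shorthand: a bare list means IN.
            predicates = [_predicate(key, "in", value)]
        else:
            # Shorthand: a bare value means equality; None means IS NULL.
            predicates = [_predicate(key, "eq", value)]
        for clause, clause_params in predicates:
            clauses.append(clause)
            params.extend(clause_params)
    # Top-level keys are combined with an implicit AND.
    return " AND ".join(clauses), params


where, params = compile_filter({
    "OR": [{"name": "London"}, {"category": "geography"}],
    "id": [1, 2, 3],
})
print(where)   # (name = %s OR category = %s) AND id IN (%s, %s, %s)
print(params)  # ['London', 'geography', 1, 2, 3]
```

Values are returned as parameters and never interpolated into the clause, so a database driver can bind them safely; column names are checked with `str.isidentifier()` because they are inserted into the SQL text directly.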

See the example here on how to use the VectorDB runtime.
