# Getting started

The LLM Module is a package built to solve the core challenges of deploying LLMs and other Gen AI models, building applications with them, and managing those applications over time. With the LLM Module, you can:

* **Easily deploy** models locally, with a choice of three serving backends optimized for LLMs and other foundation models. We also offer integration with third-party services (OpenAI, to start) as a 'hosted' alternative.
* **Build applications** with those deployments. We offer an out-of-the-box memory component for storing chat history within an application, support for prompt templates and templating tools, and support for custom components, all plug-and-play within Core 2 pipelines.
* **Leverage the rest of Seldon's feature set** for model management, logging, monitoring, access management and more!

![llm-components.png](https://1351131837-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FcsWxN0Xouw6OXkKNUoAe%2Fuploads%2Fgit-blob-043edbaa150f5711a5c7b4a815dd52da031c6480%2Fllm-components.png?alt=media)

## Runtimes

The Seldon LLM Module provides five components to support your LLM deployment, application-building, and monitoring needs. Each component supports a different part of the LLM application landscape, from deploying the model itself to implementing common AI application design patterns around that deployment, such as retrieval-augmented generation and conversational memory. Because the components are implemented as MLServer runtimes, each uses its own `model-settings.json` configuration file and defines its own inference request and response formats. The components offered are:

| Runtime                                                                      | Description                                                                                                                                                              |
| ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| [API](https://docs.seldon.ai/llm-module/components/models/api)               | Access hosted models, like OpenAI and Gemini                                                                                                                             |
| [Local](https://docs.seldon.ai/llm-module/components/models/local)           | Deploy LLM and GenAI models locally, leveraging [performance optimizations](https://docs.seldon.ai/llm-module/components/models/local#backends) for LLM and GenAI models |
| [Prompts](https://docs.seldon.ai/llm-module/components/prompting)            | Set up modular and LLM-agnostic prompts that allow an LLM to be reused across different use cases                                                                        |
| [Conversational Memory](https://docs.seldon.ai/llm-module/components/memory) | Store and retrieve conversation history as part of an LLM application                                                                                                    |
| [Retrieval](https://docs.seldon.ai/llm-module/components/retrieval)          | Retrieve relevant context from a vector database given an embedding vector                                                                                               |
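As a sketch of what a runtime configuration might look like, here is a hypothetical `model-settings.json` for the API runtime. The top-level `name`, `implementation`, and `parameters` fields follow MLServer's standard settings schema, but the `implementation` class path and the keys under `parameters.extra` are illustrative assumptions, not the module's documented schema; consult the linked runtime pages for the actual fields.

```json
{
  "name": "gpt-chat",
  "implementation": "mlserver_llm_api.LLMRuntime",
  "parameters": {
    "extra": {
      "provider_id": "openai",
      "config": {
        "model_id": "gpt-4o",
        "model_type": "chat.completions"
      }
    }
  }
}
```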


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.seldon.ai/llm-module/introduction.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
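As a minimal sketch, the request above can be issued from Python. The only detail beyond the pattern shown is that the question must be URL-encoded before being placed in the `ask` query parameter; the example question itself is illustrative.

```python
from urllib.parse import urlencode


def build_ask_url(page_url: str, question: str) -> str:
    """Append the `ask` query parameter, URL-encoding the question."""
    return f"{page_url}?{urlencode({'ask': question})}"


url = build_ask_url(
    "https://docs.seldon.ai/llm-module/introduction.md",
    "Which serving backends does the Local runtime support?",
)
print(url)
# A plain GET on this URL (e.g. urllib.request.urlopen(url))
# returns the answer with relevant excerpts and sources.
```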
