How to use a vector store to retrieve data

A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.

Retrievers accept a string query as input and return a list of Documents as output.
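
For example, calling a retriever looks like this. This is a minimal sketch; retriever stands in for any retriever instance, such as the vector store retriever shown later in this guide:

// `retriever` can be any retriever instance (hypothetical here).
const relevantDocs = await retriever.getRelevantDocuments(
  "What did the president say about Justice Breyer?"
);
// Each result is a Document with pageContent and metadata fields.
console.log(relevantDocs[0].pageContent);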

Advanced Retrieval Types

LangChain provides several advanced retrieval types. A full list is below, along with the following information:

Name: Name of the retrieval algorithm.

Index Type: Which index type (if any) this relies on.

Uses an LLM: Whether this retrieval method uses an LLM.

When to Use: Our commentary on when you should consider using this retrieval method.

Description: Description of what this retrieval algorithm is doing.

| Name | Index Type | Uses an LLM | When to Use | Description |
| --- | --- | --- | --- | --- |
| Vectorstore | Vectorstore | No | If you are just getting started and looking for something quick and easy. | This is the simplest method and the one that is easiest to get started with. It involves creating embeddings for each piece of text. |
| ParentDocument | Vectorstore + Document Store | No | If your pages have lots of smaller pieces of distinct information that are best indexed by themselves, but best retrieved all together. | This involves indexing multiple chunks for each document. You then find the chunks that are most similar in embedding space, but retrieve and return the whole parent document rather than individual chunks. |
| Multi Vector | Vectorstore + Document Store | Sometimes during indexing | If you are able to extract information from documents that you think is more relevant to index than the text itself. | This involves creating multiple vectors for each document. Each vector could be created in a myriad of ways; examples include summaries of the text and hypothetical questions. |
| Self Query | Vectorstore | Yes | If users are asking questions that are better answered by fetching documents based on metadata rather than similarity with the text. | This uses an LLM to transform user input into two things: (1) a string to look up semantically, and (2) a metadata filter to go along with it. This is useful because oftentimes questions are about the METADATA of documents (not the content itself). |
| Contextual Compression | Any | Sometimes | If you are finding that your retrieved documents contain too much irrelevant information and are distracting the LLM. | This puts a post-processing step on top of another retriever and extracts only the most relevant information from retrieved documents. This can be done with embeddings or an LLM. |
| Time-Weighted Vectorstore | Vectorstore | No | If you have timestamps associated with your documents, and you want to retrieve the most recent ones. | This fetches documents based on a combination of semantic similarity (as in normal vector retrieval) and recency (looking at timestamps of indexed documents). |
| Multi-Query Retriever | Any | Yes | If users are asking questions that are complex and require multiple pieces of distinct information to respond. | This uses an LLM to generate multiple queries from the original one. This is useful when the original query needs pieces of information about multiple topics to be properly answered. By generating multiple queries, we can then fetch documents for each of them (see the sketch after this table). |
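
To make one of these concrete, here is a minimal sketch of the Multi-Query Retriever wrapping a vector store retriever. It assumes the langchain/retrievers/multi_query entrypoint and an existing vectorStore; treat it as an illustration rather than a complete program:

import { MultiQueryRetriever } from "langchain/retrievers/multi_query";
import { ChatOpenAI } from "@langchain/openai";

// Wrap an existing vector store retriever; the LLM rephrases the user's
// question into several query variants behind the scenes.
const multiQueryRetriever = MultiQueryRetriever.fromLLM({
  llm: new ChatOpenAI({}),
  retriever: vectorStore.asRetriever(), // vectorStore assumed to exist already
});

// Results for each generated query are merged and deduplicated.
const docs = await multiQueryRetriever.getRelevantDocuments(
  "What did the president say about Justice Breyer?"
);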

Third Party Integrations

LangChain also integrates with many third-party retrieval services. For a full list of these, check out this list of all integrations.

Get started

The public API of the BaseRetriever class in LangChain.js is as follows:

export abstract class BaseRetriever {
  abstract getRelevantDocuments(query: string): Promise<Document[]>;
}

It's that simple! You can call getRelevantDocuments to retrieve documents relevant to a query, where "relevance" is defined by the specific retriever object you are calling.
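
For illustration, here is a toy, self-contained retriever with that same shape. Note that this is a sketch of the interface only, not a real BaseRetriever subclass, which has a few additional requirements beyond this method:

import type { Document } from "@langchain/core/documents";

// A toy "retriever" that filters an in-memory array by substring match.
// It only illustrates the getRelevantDocuments contract.
class KeywordRetriever {
  constructor(private docs: Document[]) {}

  async getRelevantDocuments(query: string): Promise<Document[]> {
    const lowered = query.toLowerCase();
    return this.docs.filter((doc) =>
      doc.pageContent.toLowerCase().includes(lowered)
    );
  }
}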

Of course, we also provide retriever implementations that we think are broadly useful. The main type of retriever in LangChain is a vector store retriever. We will focus on that here.

Note: Before reading, it's important to understand what a vector store is.

This example showcases question answering over documents. We have chosen this as the example for getting started because it nicely combines a lot of different elements (Text splitters, embeddings, vectorstores) and then also shows how to use them in a chain.

Question answering over documents consists of four steps:

  1. Create an index
  2. Create a Retriever from that index
  3. Create a question answering chain
  4. Ask questions!

Each of the steps has multiple sub-steps and potential configurations, but we'll go through one common flow using HNSWLib, a local vector store. This assumes you're using Node, but you can swap in another integration if necessary.

First, install the required dependency:

npm install @langchain/openai hnswlib-node @langchain/community

You can download the state_of_the_union.txt file here.

import { HNSWLib } from "@langchain/community/vectorstores/hnswlib";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import * as fs from "fs";
import { formatDocumentsAsString } from "langchain/util/document";
import {
  RunnablePassthrough,
  RunnableSequence,
} from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";
import {
  ChatPromptTemplate,
  HumanMessagePromptTemplate,
  SystemMessagePromptTemplate,
} from "@langchain/core/prompts";

// Initialize the LLM to use to answer the question.
const model = new ChatOpenAI({});
const text = fs.readFileSync("state_of_the_union.txt", "utf8");
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
const docs = await textSplitter.createDocuments([text]);
// Create a vector store from the documents.
const vectorStore = await HNSWLib.fromDocuments(docs, new OpenAIEmbeddings());

// Initialize a retriever wrapper around the vector store
const vectorStoreRetriever = vectorStore.asRetriever();

// Create a system & human prompt for the chat model
const SYSTEM_TEMPLATE = `Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
{context}`;
const messages = [
  SystemMessagePromptTemplate.fromTemplate(SYSTEM_TEMPLATE),
  HumanMessagePromptTemplate.fromTemplate("{question}"),
];
const prompt = ChatPromptTemplate.fromMessages(messages);

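// The first step of the sequence is a map: "context" is produced by running
// the retriever on the input string and concatenating the retrieved documents,
// while "question" passes the original input through unchanged.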
const chain = RunnableSequence.from([
  {
    context: vectorStoreRetriever.pipe(formatDocumentsAsString),
    question: new RunnablePassthrough(),
  },
  prompt,
  model,
  new StringOutputParser(),
]);

const answer = await chain.invoke(
  "What did the president say about Justice Breyer?"
);

console.log({ answer });

/*
  {
    answer: 'The president thanked Justice Stephen Breyer for his service and honored him for his dedication to the country.'
  }
*/


Let's walk through what's happening here.

  1. We first load a long text and split it into smaller documents using a text splitter. We then load those documents (which also embeds the documents using the passed OpenAIEmbeddings instance) into HNSWLib, our vector store, creating our index.

  2. Though we can query the vector store directly (see the sketch after this list), we convert the vector store into a retriever to return retrieved documents in the right format for the question answering chain.

  3. We initialize a retrieval chain, which we'll call later in step 4.

  4. We ask questions!
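
For comparison with step 2, here is what querying the vector store directly looks like. This is a minimal sketch; similaritySearch returns the top-k most similar chunks as raw Documents, without formatting them for a chain:

// Query the vector store directly: returns the 4 chunks most similar
// to the question, as raw Document objects.
const similarDocs = await vectorStore.similaritySearch(
  "What did the president say about Justice Breyer?",
  4
);
console.log(similarDocs.map((doc) => doc.pageContent));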

See the individual sections for deeper dives on specific retrievers, or this section to learn how to create your own custom retriever over any data source.
