Chat History and Memory in LangChain's Conversational Retrieval Chain: In-Depth Analysis and Practice

碧海醫心

Published: 2025-10-15 09:34:20

Source: php中文网

Original article

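This article examines a common question about LangChain's `ConversationalRetrievalChain`: why `chat_history` may still need to be passed explicitly when a prompt template and memory are already configured. It walks through memory management, prompt template construction, and the role of the `get_chat_history` parameter, and provides a complete setup for building context-aware conversational retrieval applications.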

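When building a conversational retrieval application with LangChain, developers often run into the following problem: even though memory has been configured for ConversationalRetrievalChain, calling the chain still fails with ValueError: Missing some input keys: {'chat_history'}. This usually stems from a misunderstanding of the role chat_history plays in the chain and of how it interacts with memory and the prompt template. This article explains that mechanism and provides a complete implementation.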

Core Concepts

To understand why chat_history is passed explicitly, first clarify the role of several key components:

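ConversationalRetrievalChain: A chain designed to answer user questions by combining conversation history with document retrieval. Internally it consists of a question generator that condenses the history and the follow-up question into a standalone question (run only when there is history), a retriever, and a chain that combines the retrieved documents with the question and history to generate the answer.
Memory: For example ConversationBufferMemory, which stores the turns of the conversation so that later turns can access earlier context. The memory_key parameter sets the key under which the stored history is exposed to the rest of the chain.
Prompt template: Defines the structure of the input that the large language model (LLM) receives. A typical conversational retrieval prompt contains placeholders such as {context} (the retrieved documents), {chat_history} (the conversation history), and {question} (the current user question).
get_chat_history parameter: An optional callable on ConversationalRetrievalChain. It receives the value found under the chat_history input key and returns the representation that is inserted into the prompts. If it is omitted, a default formatter turns the history into a string. With get_chat_history=lambda h: h, the history is passed through unchanged.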
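
The placeholder names listed for the prompt template can be checked before a template is handed to the chain. A minimal sketch, assuming the template uses Python format-style `{name}` placeholders; `template_variables` and the sample template are illustrative names, not part of LangChain:

```python
from string import Formatter
from typing import Set


def template_variables(template: str) -> Set[str]:
    """Return the names of all {placeholders} found in a format-style template."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text, so it is filtered out.
    return {field for _, field, _, _ in Formatter().parse(template) if field}


# Hypothetical template mirroring the placeholders discussed above.
template = (
    "Context:\n{context}\n\n"
    "Chat history:\n{chat_history}\n\n"
    "Question:\n{question}\n"
)

required = {"context", "chat_history", "question"}
missing = required - template_variables(template)
print(missing or "all placeholders present")
```

Running a check like this at start-up surfaces a misspelled placeholder earlier than a failed chain call would.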

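The core of the problem is this: although ConversationBufferMemory maintains the conversation history internally, chat_history is one of the chain's declared input keys, and the prompt used by the internal combine_docs_chain (for example, one passed via combine_docs_chain_kwargs={"prompt": qa_prompt}) may also require {chat_history} as an input variable. The chain must therefore find a chat_history value among its inputs, whether it is supplied by the caller or loaded from memory under a matching memory_key. The get_chat_history parameter then determines how that value is rendered before it reaches the prompt.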
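
The failure mode described above can be modelled in a few lines of plain Python. The sketch below is a simplified model, not LangChain's actual implementation: `ToyMemory` and `prepare_inputs` are hypothetical names, and the only behaviour assumed is that values loaded from memory are merged into the inputs before the required keys are checked.

```python
from typing import Any, Dict, List, Optional, Tuple

REQUIRED_KEYS = ["question", "chat_history"]  # input keys the chain expects


class ToyMemory:
    """Hypothetical stand-in for a buffer memory; not a LangChain class."""

    def __init__(self, memory_key: str = "chat_history") -> None:
        self.memory_key = memory_key
        self.turns: List[Tuple[str, str]] = []

    def load_memory_variables(self) -> Dict[str, Any]:
        # Expose the stored turns under the configured key.
        return {self.memory_key: list(self.turns)}


def prepare_inputs(inputs: Dict[str, Any], memory: Optional[ToyMemory]) -> Dict[str, Any]:
    """Merge memory variables into the inputs, then check the required keys."""
    merged = dict(inputs)
    if memory is not None:
        # Values from memory overwrite caller-supplied values with the same key.
        merged.update(memory.load_memory_variables())
    missing = set(REQUIRED_KEYS) - set(merged)
    if missing:
        raise ValueError(f"Missing some input keys: {missing}")
    return merged


# No memory and no explicit history: the required key is absent.
try:
    prepare_inputs({"question": "What is FAISS?"}, memory=None)
except ValueError as err:
    print(err)

# Memory stored under a different key does not satisfy the requirement either.
try:
    prepare_inputs({"question": "What is FAISS?"}, ToyMemory(memory_key="history"))
except ValueError as err:
    print(err)

# Passing the history explicitly, or using a matching memory_key, both succeed.
print(prepare_inputs({"question": "What is FAISS?", "chat_history": []}, None))
print(prepare_inputs({"question": "What is FAISS?"}, ToyMemory()))
```

In this model the error disappears as soon as a chat_history value is present among the merged inputs, which is why a mismatched memory_key is worth ruling out first.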

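Environment Setup: Building the Retrieval Index

Before building the conversational retrieval chain, we need a searchable knowledge base. Here FAISS serves as the vector store, and VertexAIEmbeddings produces the text embeddings.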

import os
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import VertexAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter, Language

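# Configure the embedding model
# Note: the keyword arguments follow the original example; check them against the
# VertexAIEmbeddings signature in your installed version. temperature, top_p, top_k
# and max_output_tokens are text-generation settings and are not meaningful for embeddings.
EMBEDDING_QPM = 100
EMBEDDING_NUM_BATCH = 5
embeddings = VertexAIEmbeddings(
    requests_per_minute=EMBEDDING_QPM,
    num_instances_per_batch=EMBEDDING_NUM_BATCH,
    model_name="textembedding-gecko",
    max_output_tokens=512,
    temperature=0.1,
    top_p=0.8,
    top_k=40
)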

# Text splitter
text_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=2000, chunk_overlap=500
)

# Load the training data and create documents
docs = []
training_data_path = "training/facts/" # assumes the training data files live in this directory
trainingData = os.listdir(training_data_path)

for training_file in trainingData:
    with open(os.path.join(training_data_path, training_file), 'r', encoding='utf-8') as f:
        print(f"Add {f.name} to dataset")
        texts = text_splitter.create_documents([f.read()])
        docs.extend(texts)

# Create the FAISS vector store from the documents and save it locally
store = FAISS.from_documents(docs, embeddings)
store.save_local("faiss_index")
print("FAISS index created and saved.")
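
To make the chunk_size and chunk_overlap arguments above more concrete, here is a deliberately naive fixed-window splitter. It is not the algorithm RecursiveCharacterTextSplitter uses (that one splits on language-aware separators first); `split_with_overlap` is a hypothetical helper that only illustrates how the two numbers relate.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Naive splitter: each chunk starts chunk_size - chunk_overlap after the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


# 25 characters, windows of 10 that share 4 characters with their neighbour.
sample = "".join(str(i % 10) for i in range(25))
for chunk in split_with_overlap(sample, chunk_size=10, chunk_overlap=4):
    print(chunk)
```

A larger overlap repeats more text across neighbouring chunks, which helps a retrieved chunk stand on its own at the cost of a bigger index.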

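Building the Conversational Retrieval Chain

Next, we build the ConversationalRetrievalChain step by step, focusing on memory, the prompt template, and the handling of chat_history.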

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.llms import VertexAI # assumes Vertex AI is used as the LLM
from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import VertexAIEmbeddings

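# Assumes the LLM and embeddings are already initialized
# code_llm = VertexAI(...) # initialize your LLM
# embeddings = VertexAIEmbeddings(...) # initialize your embeddings

# 1. Load the FAISS index and create the retriever
# Make sure the faiss_index directory and the embeddings model match those used to create the index
store = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True) # Note: use this flag with caution if the index comes from an untrusted source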
retriever = store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 2},
)

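# 2. Initialize the conversation memory
# memory_key='chat_history' is the important part: it sets the variable name under which the stored history is exposed to the prompt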
memory = ConversationBufferMemory(
    memory_key='chat_history', return_messages=True, output_key='answer'
)

# 3. Define a custom prompt template
# Note: the template must contain the {context}, {chat_history}, and {question} placeholders
promptTemplate = """Answer the user's question based on the provided context and chat history.

Context:
{context}

Chat history:
{chat_history}

User question:
{question}
"""

messages = [
    SystemMessagePromptTemplate.from_template(promptTemplate),
    HumanMessagePromptTemplate.from_template("{question}") # {question} here is the actual user input
]
qa_prompt = ChatPromptTemplate.from_messages(messages)

# 4. Initialize the LLM
code_llm = VertexAI(
    model_name="gemini-pro", # or another model that suits your use case
    max_output_tokens=512,
    temperature=0.1,
    top_p=0.8,
    top_k=40
)

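# 5. Build the ConversationalRetrievalChain
# get_chat_history=lambda h: h passes the chat history through unchanged instead of applying the default string formatter
# combine_docs_chain_kwargs={"prompt": qa_prompt} injects the custom prompt template into the document-combining chain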
qa_chain = ConversationalRetrievalChain.from_llm(
    llm=code_llm,
    retriever=retriever,
    memory=memory,
    get_chat_history=lambda h: h,
    combine_docs_chain_kwargs={"prompt": qa_prompt}
)

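# 6. Maintain an external chat history and call the chain
# The external history list supplies the chain's 'chat_history' input key
history = []

def chat_with_bot(question: str) -> str:
    # history is only mutated (append), never rebound, so no global statement is needed

    # Pass 'question' and 'chat_history' explicitly when calling the chain.
    # The history value is rendered by get_chat_history before it reaches the prompt.
    # After the call, ConversationBufferMemory saves the question and the answer.
    # Note: memory is attached under the same key, so the history it loads is merged
    # into the inputs and takes precedence over the list passed here.
    response = qa_chain.invoke({"question": question, "chat_history": history})

    answer = response['answer']
    # Update the external history list for the next call
    history.append((question, answer))
    return answer

# Example conversation
print(chat_with_bot("What is FAISS?"))
print(chat_with_bot("What is it used for?"))
print(chat_with_bot("How do I load it with Python?"))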
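
The `lambda h: h` passed as get_chat_history above hands the history to the prompt as-is, so the model sees the raw Python representation of the list. A small formatter is one alternative. This is a sketch that assumes the history is a list of (question, answer) tuples like the external `history` list; `format_chat_history` is a hypothetical name, and with return_messages=True the memory yields message objects instead, so the function would need adapting in that case.

```python
from typing import List, Tuple


def format_chat_history(turns: List[Tuple[str, str]]) -> str:
    """Render (question, answer) pairs as a plain-text transcript."""
    lines = []
    for question, answer in turns:
        lines.append(f"Human: {question}")
        lines.append(f"Assistant: {answer}")
    return "\n".join(lines)


# Sample data for illustration only.
sample_history = [("What is FAISS?", "A library for vector similarity search.")]
print(format_chat_history(sample_history))
```

A function like this could be supplied as get_chat_history=format_chat_history when constructing the chain, in place of the pass-through lambda.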

Notes and Best Practices

The dual role of chat_history:

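- As a chain input: chat_history is one of the chain's input keys, and a prompt template that references {chat_history} needs a value for it. get_chat_history=lambda h: h passes the value found under 'chat_history' through unchanged rather than formatting it into a string. In this example, a list holding the conversation so far is passed as chat_history on every call.
- As part of memory management: ConversationBufferMemory manages the conversation history under memory_key='chat_history' and exposes it to the chain under that key, so memory_key must match the history variable name referenced in the prompt template. The distinction matters: get_chat_history controls how the history value is rendered for the prompt, while memory handles storing the turns and loading them back on each call.
Placeholder matching in the prompt template: Make sure the custom prompt template contains the {context}, {chat_history}, and {question} placeholders, and that their names match both what the chain expects internally and the memory's memory_key.
Maintaining the external history: This example keeps an external history list. After each turn, the latest question-answer pair is appended so that the full history can be passed on the next call. This is what supplies the chat_history input key explicitly on each call.
allow_dangerous_deserialization=True: Loading a FAISS index deserializes a pickle file, so use allow_dangerous_deserialization=True only when the index is a local file whose origin you trust. In production, or when handling indexes from unknown sources, be cautious and consider a safer loading path.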
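
One way to act on the caution about deserialization is to verify the index files against digests recorded when the index was saved. This is a sketch under stated assumptions, not a feature of LangChain or FAISS: `sha256_of` and `index_is_trusted` are hypothetical helpers, and the file names `index.faiss` and `index.pkl` assume the default index name used by save_local.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def index_is_trusted(index_dir: str, expected: Dict[str, str]) -> bool:
    """True only if every expected file exists and matches its recorded digest."""
    root = Path(index_dir)
    for name, digest in expected.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != digest:
            return False
    return True


# Hypothetical usage: digests recorded right after store.save_local("faiss_index").
# expected = {"index.faiss": "<recorded digest>", "index.pkl": "<recorded digest>"}
# if index_is_trusted("faiss_index", expected):
#     store = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)
```

A digest check only detects tampering after the digests were recorded; it does not make an index from an untrusted author safe to load.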

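Summary

ConversationalRetrievalChain is a powerful tool, but configuring it, especially its handling of conversation history, requires a clear understanding of LangChain's internals. By setting the memory_key of ConversationBufferMemory correctly, writing a custom prompt template that includes {chat_history}, setting get_chat_history=lambda h: h, and passing an externally maintained chat_history list on every call, we can resolve the ValueError: Missing some input keys: {'chat_history'}. error and build a context-aware conversational retrieval system. Understanding how these components work together is the key to building robust, intelligent conversational applications.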
