无耳 Solon（OpenSolon） | rag

Tool Call (or Function Call) is one way to connect a large language model to external resources. RAG (Retrieval-Augmented Generation) is another.

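Tool Call lets the model invoke external functions or tools on demand while it is generating.
- Used to build interactive systems
RAG retrieves data and augments the user message (or prompt) before it is submitted to the model, so the model has context to refer to while generating.
- Improves the accuracy and relevance of the generated content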

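RAG is a framework that combines retrieval with generative AI, using external knowledge to improve the accuracy and contextual relevance of a generative model's answers. It suits applications that need high accuracy, domain knowledge, or up-to-date information.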

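1. Two retrieval approaches

Regular retrieval

Query a data source such as a relational database (Rdb), Es, or MongoDb, format the results as a string, and use that string as the context of the prompt.

//The context only needs to be convertible to a string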
ChatMessage.ofUserAugment(message, context);
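
To make the "format as a string" step concrete, here is a minimal sketch in plain Java. The class `ContextAugmentSketch`, its methods, and the prompt template are hypothetical illustrations, not part of solon-ai; the template that `ChatMessage.ofUserAugment` uses internally may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of solon-ai: shows one way retrieved rows
// could be flattened into a string and merged into the user message.
final class ContextAugmentSketch {
    // Render each row as "key=value, key=value", one row per line.
    static String formatRows(List<Map<String, Object>> rows) {
        return rows.stream()
                .map(row -> row.entrySet().stream()
                        .map(e -> e.getKey() + "=" + e.getValue())
                        .collect(Collectors.joining(", ")))
                .collect(Collectors.joining("\n"));
    }

    // Assumed prompt template; the one used by the library may differ.
    static String augment(String message, String context) {
        return message + "\n\nReference context:\n" + context;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("title", "Solon");
        System.out.println(augment("What is Solon?", formatRows(List.of(row))));
    }
}
```

Any data source works here as long as its rows can be rendered as text; the model only ever sees the final string.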

Vector similarity retrieval

This approach relies on an EmbeddingModel to turn content into vector data, and on a vector store such as Milvus, VectoRex, or RedisSearch.
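
The idea behind vector similarity retrieval can be sketched with a small in-memory example. This is an illustration only: `VectorSearchSketch` and `Entry` are hypothetical names, the vectors are supplied by hand rather than produced by an EmbeddingModel, and cosine similarity is assumed as the metric (a real vector store may use a different metric and an index instead of a full scan).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory vector search; a real setup would obtain vectors from
// an EmbeddingModel and delegate the search to a vector store.
final class VectorSearchSketch {
    static final class Entry {
        final String text;
        final double[] vector;
        Entry(String text, double[] vector) { this.text = text; this.vector = vector; }
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector.
    // Assumes both vectors have the same length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Return the texts of the k entries most similar to the query vector.
    static List<String> topK(List<Entry> entries, double[] query, int k) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingDouble((Entry e) -> -cosine(e.vector, query)));
        List<String> result = new ArrayList<>();
        for (Entry e : sorted.subList(0, Math.min(k, sorted.size()))) {
            result.add(e.text);
        }
        return result;
    }
}
```

With entries for "cats" at (1, 0), "dogs" at (0.9, 0.1), and "cars" at (0, 1), a query vector of (1, 0.05) ranks "cats" and "dogs" ahead of "cars".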

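2. A unified retrieval interface: the knowledge base

solon-ai uses the "knowledge base" (Repository) as its unified retrieval interface. There are two kinds:

Interface	Description
Repository	Knowledge base (search only)
RepositoryStorable	Storable knowledge base (search and store)

Both approaches, regular retrieval and vector similarity retrieval, can be wrapped as a knowledge base, which gives them the same interface.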
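
As a rough illustration of the search-only / storable split, the sketch below wraps a plain keyword match behind a pair of interfaces. The names `SearchOnlyKnowledge`, `StorableKnowledge`, and `KeywordKnowledge` are hypothetical and work on strings for brevity; they are not the solon-ai `Repository` signatures, which may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical interfaces that mirror the search-only / storable split;
// the actual solon-ai signatures may differ.
interface SearchOnlyKnowledge {
    List<String> search(String query);
}

interface StorableKnowledge extends SearchOnlyKnowledge {
    void insert(List<String> documents);
}

// A "regular retrieval" backend (case-insensitive keyword match)
// wrapped behind the same interface a vector backend could implement.
final class KeywordKnowledge implements StorableKnowledge {
    private final List<String> documents = new ArrayList<>();

    @Override
    public void insert(List<String> docs) {
        documents.addAll(docs);
    }

    @Override
    public List<String> search(String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String doc : documents) {
            if (doc.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

Code that only needs to read can depend on the narrower search-only type, so a keyword backend and a vector backend become interchangeable from the caller's point of view.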

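3. The four technical concepts in solon-ai rag and how they fit together

Embedding model (EmbeddingModel)

The embedding model helps the knowledge base convert document content into embedding vectors.

Reranking model (RerankingModel)

The reranking model refines the ordering of the results after the knowledge base has returned them.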
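
The reranking step can be pictured as "score each candidate against the query, then sort". The sketch below uses a toy word-overlap score as a stand-in for a real reranking model; `RerankSketch` and its scoring rule are assumptions made for illustration and say nothing about how RerankingModel scores documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical reranker: a toy scoring rule stands in for a reranking model.
final class RerankSketch {
    // Toy relevance score: number of distinct query words that appear in the document.
    static int overlap(String query, String doc) {
        Set<String> docWords = new HashSet<>(
                Arrays.asList(doc.toLowerCase(Locale.ROOT).split("\\s+")));
        Set<String> queryWords = new HashSet<>(
                Arrays.asList(query.toLowerCase(Locale.ROOT).split("\\s+")));
        queryWords.retainAll(docWords);
        return queryWords.size();
    }

    // Sort candidates by descending score; the sort is stable,
    // so candidates with equal scores keep their original order.
    static List<String> rerank(String query, List<String> candidates) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt((String d) -> -overlap(query, d)));
        return sorted;
    }
}
```

For the query "solon rag guide", the candidates "vector store notes", "solon rag guide intro", and "solon overview" come back as "solon rag guide intro", "solon overview", "vector store notes".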

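Knowledge base

Interface (or concept)	Description
Document	A document (a piece of knowledge)
DocumentLoader	Document loader (defines the interface for loading documents; you can also load them any other way)
DocumentSplitter	Document splitter (defines the interface for splitting documents; you can also split them any other way)

Repository	Knowledge base (searches documents)
RepositoryStorable	Knowledge base (searches and stores documents)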
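
Since documents may be split "any other way", here is one simple possibility: fixed-size character chunks with a small overlap, so that text cut at a chunk boundary still appears whole in one of the neighbours. `SplitterSketch` is a hypothetical helper, not a solon-ai DocumentSplitter implementation, and the chunk size and overlap are arbitrary choices.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical splitter: fixed-size character windows with overlap.
final class SplitterSketch {
    // Split text into chunks of at most `size` characters, where each chunk
    // repeats the last `overlap` characters of the previous one.
    static List<String> split(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) break; // last chunk reached the end of the text
        }
        return chunks;
    }
}
```

For example, `SplitterSketch.split("abcdefghij", 4, 1)` yields the chunks "abcd", "defg", and "ghij". Production splitters often cut on sentence or token boundaries instead of raw characters.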

Chat model (ChatModel)

Everything above serves one purpose: improving the accuracy and contextual relevance of the chat model's answers.

//Build the embeddingModel
EmbeddingModel  embeddingModel = EmbeddingModel.of(embedding_apiUrl)
        .apiKey(embedding_apiKey)
        .model(embedding_model)
        .build();
        
//Build the rerankingModel (optional)
RerankingModel  rerankingModel = RerankingModel.of(reranking_apiUrl)
        .apiKey(reranking_apiKey)
        .model(reranking_model)
        .build();
        

//Build the repository (a knowledge base backed by web search)
WebSearchRepository repository = WebSearchRepository.of(websearch_apiUrl)
        .apiKey(websearch_apiKey)
        .embeddingModel(embeddingModel)
        .build();

//Build the chatModel
ChatModel chatModel = ChatModel.of(chat_apiUrl)
        .apiKey(chat_apiKey)
        .model(chat_model)
        .build();

//Usage example
public void init() throws Exception { //Initialize the knowledge base
      //insert(...) requires a storable knowledge base (RepositoryStorable)
      repository.insert(Arrays.asList(new Document("demo")));
}

public void query(String message) throws Exception { //Query
      //Search the knowledge base
      List<Document> context = repository.search(message);
      
      //Rerank (optional)
      context = rerankingModel.rerank(message, context);

      //Call the chat model
      ChatResponse resp = chatModel
                      .prompt(ChatMessage.ofUserAugment(message, context)) //Prompt augmentation (string formatting internally)
                      .call();
}

Solon v3.5.0
