For the latest snapshot version, use Spring AI 1.1.3!

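Chat Client API

The ChatClient offers a fluent API for communicating with an AI model. It supports both a synchronous and a streaming programming model.

See the Implementation Notes at the bottom of this document for details on the combined use of imperative and reactive programming models in ChatClient.

The fluent API has methods for building up the constituent parts of a Prompt that is passed to the AI model as input. The Prompt contains the instructional text that guides the AI model's output and behavior. From the API point of view, a prompt consists of a collection of messages.

The AI model processes two main types of messages: user messages, which are direct inputs from the user, and system messages, which are generated by the system to guide the conversation.

These messages often contain placeholders that are substituted at runtime based on user input, to customize the AI model's response to that input.

There are also Prompt options that can be specified, such as the name of the AI model to use and the temperature setting that controls the randomness or creativity of the generated output.

Creating a ChatClient

The ChatClient is created using a ChatClient.Builder object. You can obtain an autoconfigured ChatClient.Builder instance for any ChatModel Spring Boot autoconfiguration, or create one programmatically.

Using an autoconfigured ChatClient.Builder

In the simplest use case, Spring AI provides Spring Boot autoconfiguration that creates a prototype ChatClient.Builder bean for you to inject into your class. Here is a simple example of retrieving a String response to a simple user request.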

@RestController
class MyController {

    private final ChatClient chatClient;

    public MyController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    @GetMapping("/ai")
    String generation(String userInput) {
        return this.chatClient.prompt()
            .user(userInput)
            .call()
            .content();
    }
}

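In this simple example, the user input sets the contents of the user message. The call() method sends a request to the AI model, and the content() method returns the AI model's response as a String.

Working with Multiple Chat Models

There are several scenarios where you might need to work with multiple chat models in a single application:

Using different models for different types of tasks (for example, a powerful model for complex reasoning and a faster, cheaper model for simpler tasks).
Implementing fallback mechanisms when one model service is unavailable.
A/B testing different models or configurations.
Providing users with a choice of models based on their preferences.
Combining specialized models (one for code generation, another for creative content, and so on).

By default, Spring AI autoconfigures a single ChatClient.Builder bean. However, you may need to work with multiple chat models in your application. Here is how to handle this scenario:

In all cases, you need to disable the ChatClient.Builder autoconfiguration by setting the property spring.ai.chat.client.enabled=false.

This allows you to create multiple ChatClient instances manually.

Multiple ChatClients with a Single Model Type

This section covers a common use case where you need to create multiple ChatClient instances that all use the same underlying model type but with different configurations.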

// Create ChatClient instances programmatically
ChatModel myChatModel = ... // already autoconfigured by Spring Boot
ChatClient chatClient = ChatClient.create(myChatModel);

// Or use the builder for more control
ChatClient.Builder builder = ChatClient.builder(myChatModel);
ChatClient customChatClient = builder
    .defaultSystem("You are a helpful assistant.")
    .build();

ChatClients for Different Model Types

When working with multiple AI models, you can define separate ChatClient beans for each model:

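import org.springframework.ai.anthropic.AnthropicChatModel;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;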

@Configuration
public class ChatClientConfig {

    @Bean
    public ChatClient openAiChatClient(OpenAiChatModel chatModel) {
        return ChatClient.create(chatModel);
    }

    @Bean
    public ChatClient anthropicChatClient(AnthropicChatModel chatModel) {
        return ChatClient.create(chatModel);
    }
}

You can then inject these beans into your application components using the @Qualifier annotation:

@Configuration
public class ChatClientExample {

    @Bean
    CommandLineRunner cli(
            @Qualifier("openAiChatClient") ChatClient openAiChatClient,
            @Qualifier("anthropicChatClient") ChatClient anthropicChatClient) {

        return args -> {
            var scanner = new Scanner(System.in);
            ChatClient chat;

            // Model selection
            System.out.println("\nSelect your AI model:");
            System.out.println("1. OpenAI");
            System.out.println("2. Anthropic");
            System.out.print("Enter your choice (1 or 2): ");

            String choice = scanner.nextLine().trim();

            if (choice.equals("1")) {
                chat = openAiChatClient;
                System.out.println("Using OpenAI model");
            } else {
                chat = anthropicChatClient;
                System.out.println("Using Anthropic model");
            }

            // Use the selected chat client
            System.out.print("\nEnter your question: ");
            String input = scanner.nextLine();
            String response = chat.prompt(input).call().content();
            System.out.println("ASSISTANT: " + response);

            scanner.close();
        };
    }
}

Multiple OpenAI-Compatible API Endpoints

The OpenAiApi and OpenAiChatModel classes provide a mutate() method that allows you to create variations of existing instances with different properties. This is particularly useful when you need to work with multiple OpenAI-compatible APIs.

@Service
public class MultiModelService {

    private static final Logger logger = LoggerFactory.getLogger(MultiModelService.class);

    @Autowired
    private OpenAiChatModel baseChatModel;

    @Autowired
    private OpenAiApi baseOpenAiApi;

    public void multiClientFlow() {
        try {
            // Derive a new OpenAiApi for Groq (Llama3)
            OpenAiApi groqApi = baseOpenAiApi.mutate()
                .baseUrl("https://api.groq.com/openai")
                .apiKey(System.getenv("GROQ_API_KEY"))
                .build();

            // Derive a new OpenAiApi for OpenAI GPT-4
            OpenAiApi gpt4Api = baseOpenAiApi.mutate()
                .baseUrl("https://api.openai.com")
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .build();

            // Derive a new OpenAiChatModel for Groq
            OpenAiChatModel groqModel = baseChatModel.mutate()
                .openAiApi(groqApi)
                .defaultOptions(OpenAiChatOptions.builder().model("llama3-70b-8192").temperature(0.5).build())
                .build();

            // Derive a new OpenAiChatModel for GPT-4
            OpenAiChatModel gpt4Model = baseChatModel.mutate()
                .openAiApi(gpt4Api)
                .defaultOptions(OpenAiChatOptions.builder().model("gpt-4").temperature(0.7).build())
                .build();

            // Simple prompt for both models
            String prompt = "What is the capital of France?";

            String groqResponse = ChatClient.builder(groqModel).build().prompt(prompt).call().content();
            String gpt4Response = ChatClient.builder(gpt4Model).build().prompt(prompt).call().content();

            logger.info("Groq (Llama3) response: {}", groqResponse);
            logger.info("OpenAI GPT-4 response: {}", gpt4Response);
        }
        catch (Exception e) {
            logger.error("Error in multi-client flow", e);
        }
    }
}

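ChatClient Fluent API

The ChatClient fluent API allows you to create a prompt in three distinct ways, using an overloaded prompt method to initiate the fluent API:

prompt(): This no-argument method lets you start using the fluent API and build up the user, system, and other parts of the prompt.
prompt(Prompt prompt): This method accepts a Prompt argument, letting you pass in a Prompt instance that you created using the Prompt's non-fluent APIs.
prompt(String content): This is a convenience method similar to the previous overload. It takes the user's text content.

ChatClient Responses

The ChatClient API offers several ways to format the response from the AI model using the fluent API.

Returning a ChatResponse

The response from the AI model is a rich structure defined by the type ChatResponse. It includes metadata about how the response was generated and can also contain multiple responses, known as Generations, each with its own metadata. The metadata includes the number of tokens (each token is approximately 3/4 of a word) used to create the response. This information is important because hosted AI models charge based on the number of tokens used per request.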
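
The "3/4 of a word per token" rule of thumb can be turned into a quick estimate. The following is a minimal sketch in plain Java, not part of Spring AI: the class name TokenEstimate and the method estimateTokens are hypothetical, and real tokenizers differ by model and language, so treat the result as a rough guide only.

```java
// Rough token estimate from a word count, using the "one token is about 3/4 of a word" rule of thumb.
class TokenEstimate {

    // Assumed ratio taken from the text above; real tokenizers vary by model and language.
    static final double WORDS_PER_TOKEN = 0.75;

    // Counts whitespace-separated words and scales by the ratio, rounding up.
    static long estimateTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        int words = trimmed.split("\\s+").length;
        return (long) Math.ceil(words / WORDS_PER_TOKEN);
    }

    public static void main(String[] args) {
        // 4 words / 0.75 = 5.33..., rounded up to 6
        System.out.println(estimateTokens("Tell me a joke"));
    }
}
```

For actual billing figures, rely on the usage metadata carried by the ChatResponse rather than an estimate like this one.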

The following example returns the ChatResponse object, including its metadata, by invoking chatResponse() after the call() method.

ChatResponse chatResponse = chatClient.prompt()
    .user("Tell me a joke")
    .call()
    .chatResponse();

Returning an Entity

You often want to return an entity class that is mapped from the returned String. The entity() method provides this functionality.

For example, given the following Java record:

record ActorFilms(String actor, List<String> movies) {}

You can easily map the AI model's output to this record using the entity() method, as shown below:

ActorFilms actorFilms = chatClient.prompt()
    .user("Generate the filmography for a random actor.")
    .call()
    .entity(ActorFilms.class);

There is also an overloaded entity method with the signature entity(ParameterizedTypeReference<T> type) that lets you specify types such as generic Lists:

List<ActorFilms> actorFilms = chatClient.prompt()
    .user("Generate the filmography of 5 movies for Tom Hanks and Bill Murray.")
    .call()
    .entity(new ParameterizedTypeReference<List<ActorFilms>>() {});

Streaming Responses

The stream() method lets you get an asynchronous response, as shown below:

Flux<String> output = chatClient.prompt()
    .user("Tell me a joke")
    .stream()
    .content();

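You can also stream the ChatResponse using the method Flux<ChatResponse> chatResponse().

In the future, we will offer a convenience method that lets you return a Java entity with the reactive stream() method. In the meantime, you should use the Structured Output Converter to convert the aggregated response explicitly, as shown below. This also demonstrates the use of parameters in the fluent API, which is discussed in more detail in a later section of the documentation.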

var converter = new BeanOutputConverter<>(new ParameterizedTypeReference<List<ActorFilms>>() {});

Flux<String> flux = this.chatClient.prompt()
    .user(u -> u.text("""
                        Generate the filmography for a random actor.
                        {format}
                      """)
            .param("format", converter.getFormat()))
    .stream()
    .content();

// Block until the stream completes, then join the chunks into a single String
String content = flux.collectList().block().stream().collect(Collectors.joining());

List<ActorFilms> actorFilms = converter.convert(content);

Prompt Templates

The ChatClient fluent API lets you provide user and system text as templates with variables that are replaced at runtime.

String answer = ChatClient.create(chatModel).prompt()
    .user(u -> u
            .text("Tell me the names of 5 movies whose soundtrack was composed by {composer}")
            .param("composer", "John Williams"))
    .call()
    .content();

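Internally, the ChatClient uses the PromptTemplate class to handle the user and system text and to replace the variables with the values provided at runtime, relying on a given TemplateRenderer implementation. By default, Spring AI uses the StTemplateRenderer implementation, which is based on the open-source StringTemplate engine developed by Terence Parr.

Spring AI also provides a NoOpTemplateRenderer for cases where no template processing is desired.

The TemplateRenderer configured directly on the ChatClient (via .templateRenderer()) applies only to the prompt content defined directly in the ChatClient builder chain (for example, via .user() or .system()). It does not affect templates used internally by Advisors such as QuestionAnswerAdvisor, which have their own template customization mechanisms (see Custom Advisor Templates).

If you would rather use a different template engine, you can provide a custom implementation of the TemplateRenderer interface directly to the ChatClient. You can also keep using the default StTemplateRenderer, but with a custom configuration.

For example, by default, template variables are identified by the {} syntax. If you plan to include JSON in your prompt, you might want to use a different syntax to avoid conflicts with JSON syntax. For example, you can use the < and > delimiters.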

String answer = ChatClient.create(chatModel).prompt()
    .user(u -> u
            .text("Tell me the names of 5 movies whose soundtrack was composed by <composer>")
            .param("composer", "John Williams"))
    .templateRenderer(StTemplateRenderer.builder().startDelimiterToken('<').endDelimiterToken('>').build())
    .call()
    .content();
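
To see why switching delimiters helps, here is a minimal sketch of placeholder substitution with configurable delimiters, written in plain Java. It is not how StTemplateRenderer or StringTemplate work internally: the class DelimiterSketch and its render method are hypothetical stand-ins. The sketch only illustrates that, once variables are marked with < and >, JSON braces in the prompt pass through untouched.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimiterSketch {

    // Replaces every start + name + end occurrence with the matching value.
    // Names without a value are left in the text unchanged.
    static String render(String template, Map<String, String> params, char start, char end) {
        Pattern variable = Pattern.compile(
                Pattern.quote(String.valueOf(start)) + "(\\w+)" + Pattern.quote(String.valueOf(end)));
        Matcher matcher = variable.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String value = params.getOrDefault(matcher.group(1), matcher.group());
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The JSON braces are plain text here because variables use < and >.
        String template = "Reply as JSON like {\"composer\": \"...\"} for <composer>";
        System.out.println(render(template, Map.of("composer", "John Williams"), '<', '>'));
    }
}
```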

call() return values

After specifying the call() method on ChatClient, there are a few different options for the response type.

String content(): returns the String content of the response.
ChatResponse chatResponse(): returns the ChatResponse object, which contains multiple generations and metadata about the response, for example how many tokens were used to create it.
ChatClientResponse chatClientResponse(): returns a ChatClientResponse object that contains the ChatResponse object and the ChatClient execution context, giving you access to additional data used during the execution of advisors (for example, the relevant documents retrieved in a RAG flow).
ResponseEntity<?> responseEntity(): returns a ResponseEntity containing the full HTTP response, including status code, headers, and body. This is useful when you need access to low-level HTTP details of the response.
entity() to return a Java type
- entity(ParameterizedTypeReference<T> type): used to return a Collection of entity types.
- entity(Class<T> type): used to return a specific entity type.
- entity(StructuredOutputConverter<T> structuredOutputConverter): used to specify an instance of a StructuredOutputConverter to convert a String to an entity type.

You can also invoke the stream() method instead of call().

Calling the call() method does not actually trigger the AI model execution. Instead, it only instructs Spring AI whether to use synchronous or streaming calls. The actual AI model call occurs when methods such as content(), chatResponse(), and responseEntity() are called.
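
The deferred execution described above can be pictured with a small plain-Java sketch. This is an illustration of the idea only, not Spring AI's implementation: LazyCallSketch, CallSpec, and the fake model that echoes its input are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyCallSketch {

    // Hypothetical stand-in for the object returned by call(): it only remembers how to run the request.
    static final class CallSpec {
        private final Supplier<String> modelInvocation;

        CallSpec(Supplier<String> modelInvocation) {
            this.modelInvocation = modelInvocation;
        }

        // Terminal method: this is where the (fake) model is actually invoked.
        String content() {
            return modelInvocation.get();
        }
    }

    // Counts how often the fake model has been invoked.
    static final AtomicInteger invocations = new AtomicInteger();

    // Mirrors call(): selects the synchronous mode but does not run anything yet.
    static CallSpec call(String userText) {
        return new CallSpec(() -> {
            invocations.incrementAndGet();
            return "echo: " + userText;
        });
    }

    public static void main(String[] args) {
        CallSpec spec = call("Tell me a joke");
        System.out.println("after call(): " + invocations.get());
        System.out.println(spec.content());
        System.out.println("after content(): " + invocations.get());
    }
}
```

In the sketch, the invocation counter changes only after content() is called, which mirrors the split between choosing a mode with call() and triggering the request with a terminal method.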

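stream() return values

After specifying the stream() method on ChatClient, there are a few options for the response type:

Flux<String> content(): returns a Flux of the String being generated by the AI model.
Flux<ChatResponse> chatResponse(): returns a Flux of ChatResponse objects, which contain additional metadata about the response.
Flux<ChatClientResponse> chatClientResponse(): returns a Flux of ChatClientResponse objects that contain the ChatResponse object and the ChatClient execution context, giving you access to additional data used during the execution of advisors (for example, the relevant documents retrieved in a RAG flow).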

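Using Defaults

Creating a ChatClient with a default system text in an @Configuration class simplifies runtime code. By setting defaults, you only need to specify the user text when calling ChatClient, eliminating the need to set a system text for each request in your runtime code path.

Default System Text

In the following example, we configure the system text to always reply in a pirate's voice. To avoid repeating the system text in runtime code, we create a ChatClient instance in a @Configuration class.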

@Configuration
class Config {

    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.defaultSystem("You are a friendly chat bot that answers question in the voice of a Pirate")
                .build();
    }

}

And a @RestController to invoke it:

@RestController
class AIController {

	private final ChatClient chatClient;

	AIController(ChatClient chatClient) {
		this.chatClient = chatClient;
	}

	@GetMapping("/ai/simple")
	public Map<String, String> completion(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
		return Map.of("completion", this.chatClient.prompt().user(message).call().content());
	}
}

When calling the application endpoint via curl, the result is:

❯ curl localhost:8080/ai/simple
{"completion":"Why did the pirate go to the comedy club? To hear some arrr-rated jokes! Arrr, matey!"}

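Default System Text with Parameters

In the following example, we use a placeholder in the system text to specify the voice of the completion at runtime instead of at design time.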

@Configuration
class Config {

    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.defaultSystem("You are a friendly chat bot that answers question in the voice of a {voice}")
                .build();
    }

}

@RestController
class AIController {
	private final ChatClient chatClient;

	AIController(ChatClient chatClient) {
		this.chatClient = chatClient;
	}

	@GetMapping("/ai")
	Map<String, String> completion(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message, String voice) {
		return Map.of("completion",
				this.chatClient.prompt()
						.system(sp -> sp.param("voice", voice))
						.user(message)
						.call()
						.content());
	}

}

When calling the application endpoint via httpie, the result is:

http localhost:8080/ai voice=='Robert DeNiro'
{
    "completion": "You talkin' to me? Okay, here's a joke for ya: Why couldn't the bicycle stand up by itself? Because it was two tired! Classic, right?"
}

Other Defaults

At the ChatClient.Builder level, you can specify the default prompt configuration.

defaultOptions(ChatOptions chatOptions): Pass in either portable options defined in the ChatOptions class or model-specific options such as those in OpenAiChatOptions. For more information on model-specific ChatOptions implementations, refer to the JavaDocs.
defaultFunction(String name, String description, java.util.function.Function<I, O> function): The name is used to refer to the function in user text. The description explains the function's purpose and helps the AI model choose the correct function for an accurate response. The function argument is a Java function instance that the model will execute when necessary.
defaultFunctions(String... functionNames): The bean names of java.util.function.Function instances defined in the application context.
defaultUser(String text), defaultUser(Resource text), defaultUser(Consumer<UserSpec> userSpecConsumer): These methods let you define the user text. The Consumer<UserSpec> allows you to use a lambda to specify the user text and any default parameters.
defaultAdvisors(Advisor... advisor): Advisors allow modification of the data used to create the Prompt. The QuestionAnswerAdvisor implementation enables the Retrieval Augmented Generation pattern by appending context information related to the user text to the prompt.
defaultAdvisors(Consumer<AdvisorSpec> advisorSpecConsumer): This method allows you to define a Consumer that configures multiple advisors using the AdvisorSpec. Advisors can modify the data used to create the final Prompt. The Consumer<AdvisorSpec> lets you specify a lambda to add advisors, such as QuestionAnswerAdvisor, which supports Retrieval Augmented Generation by appending relevant context information based on the user text.

You can override these defaults at runtime using the corresponding methods without the default prefix.

options(ChatOptions chatOptions)
function(String name, String description, java.util.function.Function<I, O> function)
functions(String... functionNames)
user(String text), user(Resource text), user(Consumer<UserSpec> userSpecConsumer)
advisors(Advisor... advisor)
advisors(Consumer<AdvisorSpec> advisorSpecConsumer)

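Advisors

The Advisors API provides a flexible and powerful way to intercept, modify, and enhance AI-driven interactions in your Spring applications.

A common pattern when calling an AI model with user text is to append or augment the prompt with contextual data.

This contextual data can be of different types. Common types include:

Your own data: This is data the AI model has not been trained on. Even if the model has seen similar data, the appended contextual data takes precedence in generating the response.
Conversational history: The chat model's API is stateless. If you tell the AI model your name, it will not remember it in subsequent interactions. Conversational history must be sent with each request to ensure previous interactions are considered when generating a response.

Advisor Configuration in ChatClient

The ChatClient fluent API provides an AdvisorSpec interface for configuring advisors. This interface offers methods to add parameters, set multiple parameters at once, and add one or more advisors to the chain.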

interface AdvisorSpec {
    AdvisorSpec param(String k, Object v);
    AdvisorSpec params(Map<String, Object> p);
    AdvisorSpec advisors(Advisor... advisors);
    AdvisorSpec advisors(List<Advisor> advisors);
}

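The order in which advisors are added to the chain is crucial, as it determines the sequence of their execution. Each advisor modifies the prompt or the context in some way, and the changes made by one advisor are passed on to the next in the chain.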

ChatClient.builder(chatModel)
    .build()
    .prompt()
    .advisors(
        MessageChatMemoryAdvisor.builder(chatMemory).build(),
        QuestionAnswerAdvisor.builder(vectorStore).build()
    )
    .user(userText)
    .call()
    .content();

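In this configuration, the MessageChatMemoryAdvisor is executed first, adding the conversation history to the prompt. Then, the QuestionAnswerAdvisor performs its search based on the user's question and the added conversation history, potentially providing more relevant results.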
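
The effect of ordering can be illustrated with a small plain-Java sketch in which each step is just a function from prompt text to prompt text. This is a simplification under stated assumptions: real advisors operate on request and response objects rather than Strings, and the names AdvisorOrderSketch, memory, and retrieval are hypothetical.

```java
import java.util.List;
import java.util.function.UnaryOperator;

class AdvisorOrderSketch {

    // Applies each step to the prompt in list order, passing the result to the next step.
    static String applyChain(List<UnaryOperator<String>> chain, String prompt) {
        String current = prompt;
        for (UnaryOperator<String> advisor : chain) {
            current = advisor.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins: one prepends history, one records what it would search for.
        UnaryOperator<String> memory = p -> "[history] " + p;
        UnaryOperator<String> retrieval = p -> p + " | searched with: '" + p + "'";

        // Memory first: the search sees the history as well as the question.
        System.out.println(applyChain(List.of(memory, retrieval), "question"));
        // Retrieval first: the search sees only the bare question.
        System.out.println(applyChain(List.of(retrieval, memory), "question"));
    }
}
```

Swapping the two steps changes what the retrieval step receives as input, which is the reason the ordering matters.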

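Learn about Question Answer Advisor

Retrieval Augmented Generation

Refer to the Retrieval Augmented Generation guide.

Logging

The SimpleLoggerAdvisor is an advisor that logs the request and response data of the ChatClient. This can be useful for debugging and monitoring your AI interactions.

Spring AI supports observability for LLM and vector store interactions. Refer to the Observability guide for more information.

To enable logging, add the SimpleLoggerAdvisor to the advisor chain when creating your ChatClient. It is recommended to add it toward the end of the chain: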

ChatResponse response = ChatClient.create(chatModel).prompt()
        .advisors(new SimpleLoggerAdvisor())
        .user("Tell me a joke?")
        .call()
        .chatResponse();

To see the logs, set the logging level for the advisor package to DEBUG:

logging.level.org.springframework.ai.chat.client.advisor=DEBUG

Add this to your application.properties or application.yaml file.

You can customize what data from the ChatClientRequest and ChatResponse is logged by using the following constructor:

SimpleLoggerAdvisor(
    Function<ChatClientRequest, String> requestToString,
    Function<ChatResponse, String> responseToString,
    int order
)

Example usage:

SimpleLoggerAdvisor customLogger = new SimpleLoggerAdvisor(
    request -> "Custom request: " + request.prompt().getUserMessage(),
    response -> "Custom response: " + response.getResult(),
    0
);

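This allows you to tailor the logged information to your specific needs.

Be cautious about logging sensitive information in production environments.

Chat Memory

The interface ChatMemory represents a storage for chat conversation memory. It provides methods to add messages to a conversation, retrieve messages from a conversation, and clear the conversation history.

There is currently one built-in implementation: MessageWindowChatMemory.

MessageWindowChatMemory is a chat memory implementation that maintains a window of messages up to a specified maximum size (default: 20 messages). When the number of messages exceeds this limit, older messages are evicted, but system messages are preserved. If a new system message is added, all previous system messages are removed from memory. This ensures that the most recent context is always available for the conversation while keeping memory usage bounded.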
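
The eviction rules described above can be sketched in plain Java as follows. This is an illustrative approximation, not the actual MessageWindowChatMemory code: the class WindowMemorySketch and its Msg record are hypothetical, and details such as exactly which message is evicted first are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

class WindowMemorySketch {

    // Minimal message type for the sketch: a role flag plus text.
    record Msg(boolean system, String text) {}

    private final int maxMessages;
    private final List<Msg> messages = new ArrayList<>();

    WindowMemorySketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    void add(Msg incoming) {
        // A new system message replaces all earlier system messages.
        if (incoming.system()) {
            messages.removeIf(Msg::system);
        }
        messages.add(incoming);
        // Evict the oldest non-system messages until the window fits.
        while (messages.size() > maxMessages) {
            int oldestNonSystem = -1;
            for (int i = 0; i < messages.size(); i++) {
                if (!messages.get(i).system()) {
                    oldestNonSystem = i;
                    break;
                }
            }
            if (oldestNonSystem < 0) {
                break; // only system messages left, nothing to evict
            }
            messages.remove(oldestNonSystem);
        }
    }

    List<Msg> get() {
        return List.copyOf(messages);
    }

    public static void main(String[] args) {
        WindowMemorySketch memory = new WindowMemorySketch(3);
        memory.add(new Msg(true, "You are a pirate"));
        memory.add(new Msg(false, "u1"));
        memory.add(new Msg(false, "a1"));
        memory.add(new Msg(false, "u2")); // evicts "u1", keeps the system message
        System.out.println(memory.get());
    }
}
```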

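The MessageWindowChatMemory is backed by the ChatMemoryRepository abstraction, which provides storage implementations for the chat conversation memory. Several implementations are available, including InMemoryChatMemoryRepository, JdbcChatMemoryRepository, CassandraChatMemoryRepository, and Neo4jChatMemoryRepository.

For more details and usage examples, see the Chat Memory documentation.

Implementation Notes

The combined use of imperative and reactive programming models in ChatClient is a unique aspect of the API. Often an application is either reactive or imperative, but not both.

When customizing the HTTP client interactions of a Model implementation, both the RestClient and the WebClient must be configured.

Due to a bug in Spring Boot 3.4, the "spring.http.client.factory=jdk" property must be set. Otherwise, it defaults to "reactor", which breaks certain AI workflows such as the ImageModel.

Streaming is only supported via the Reactive stack. For this reason, imperative applications must include the Reactive stack (for example, spring-boot-starter-webflux).
Non-streaming is only supported via the Servlet stack. For this reason, reactive applications must include the Servlet stack (for example, spring-boot-starter-web) and expect some calls to be blocking.
Tool calling is imperative, leading to blocking workflows. This also results in partial or interrupted Micrometer observations (for example, the ChatClient spans and the tool calling spans are not connected, so the former remains incomplete).
The built-in advisors perform blocking operations for standard calls and non-blocking operations for streaming calls. The Reactor Scheduler used for the advisor streaming calls can be configured via the Builder on each Advisor class.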