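Anthropic Chat

Anthropic Claude is a family of foundational AI models that can be used in a variety of applications. Developers and businesses can use the API access to build directly on top of Anthropic's AI infrastructure.

Spring AI supports the Anthropic Messaging API for synchronous and streaming text generation.

Anthropic's Claude models are also available through Amazon Bedrock Converse. Spring AI provides dedicated Amazon Bedrock Converse Anthropic client implementations as well.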

Prerequisites

You will need to create an API key on the Anthropic portal.

Create an account at the Anthropic API dashboard and generate the API key on the Get API Keys page.

The Spring AI project defines a configuration property named spring.ai.anthropic.api-key, which you should set to the value of the API key obtained from anthropic.com.

You can set this configuration property in your application.properties file:

spring.ai.anthropic.api-key=<your-anthropic-api-key>

For enhanced security when handling sensitive information such as API keys, you can use Spring Expression Language (SpEL) to reference a custom environment variable:

# In application.yml
spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}

# In your environment or .env file
export ANTHROPIC_API_KEY=<your-anthropic-api-key>

You can also set this configuration programmatically in your application code:

// Retrieve API key from a secure source or environment variable
String apiKey = System.getenv("ANTHROPIC_API_KEY");
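
The lookup above returns null when the variable is missing, which tends to surface later as an authentication error. As a minimal sketch of a fail-fast alternative, the helper below reads the key from a map of environment variables and rejects missing or blank values. ApiKeyResolver and its method are hypothetical names used for illustration; they are not part of Spring AI.

```java
import java.util.Map;

// Hypothetical helper (not part of Spring AI): resolves an API key from
// a map of environment variables and fails fast when it is absent.
final class ApiKeyResolver {

    static String resolve(Map<String, String> env, String variableName) {
        String value = env.get(variableName);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException(
                "Environment variable " + variableName + " is not set");
        }
        // Trim stray whitespace that often sneaks in from .env files.
        return value.trim();
    }
}
```

In an application the map would typically be System.getenv(), for example ApiKeyResolver.resolve(System.getenv(), "ANTHROPIC_API_KEY").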

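Add Repositories and BOM

Spring AI artifacts are published in the Maven Central and Spring Snapshot repositories. Refer to the Artifact Repositories section to add these repositories to your build system.

To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.

Auto-configuration

There has been a significant change in the artifact names of the Spring AI auto-configuration and starter modules. Please refer to the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the Anthropic Chat Client. To enable it, add the following dependency to your project's Maven pom.xml or Gradle build.gradle file: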

Maven:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
</dependency>

Gradle:

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-anthropic'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Chat Properties

Retry Properties

The prefix spring.ai.retry is used as the property prefix that lets you configure the retry mechanism for the Anthropic chat model.

Property	Description	Default
spring.ai.retry.max-attempts	Maximum number of retry attempts.	10
spring.ai.retry.backoff.initial-interval	Initial sleep duration for the exponential backoff policy.	2 sec.
spring.ai.retry.backoff.multiplier	Backoff interval multiplier.	5
spring.ai.retry.backoff.max-interval	Maximum backoff duration.	3 min.
spring.ai.retry.on-client-errors	If false, throw a NonTransientAiException and do not attempt a retry for `4xx` client error codes.	false
spring.ai.retry.exclude-on-http-codes	List of HTTP status codes that should not trigger a retry (e.g. to throw NonTransientAiException).	empty
spring.ai.retry.on-http-codes	List of HTTP status codes that should trigger a retry (e.g. to throw TransientAiException).	empty

Currently the retry policies are not applicable to the streaming API.
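
To make the retry defaults concrete, the sketch below computes the wait before each retry for a given initial interval, multiplier, maximum interval and attempt count. It is a plain JDK model under two assumptions: the usual exponential back-off semantics (each wait is the previous one times the multiplier, capped at the maximum), and that max-attempts counts the initial call. BackoffSchedule is a hypothetical name and the code does not call Spring AI or Spring Retry.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Illustrative model of an exponential back-off schedule; it only mirrors
// the property defaults from the table and is not the library's code.
final class BackoffSchedule {

    // Returns the wait before each retry. Assumption: maxAttempts includes
    // the first call, so there are at most maxAttempts - 1 waits.
    static List<Duration> delays(Duration initialInterval, double multiplier,
                                 Duration maxInterval, int maxAttempts) {
        List<Duration> delays = new ArrayList<>();
        double nextMillis = initialInterval.toMillis();
        for (int retry = 1; retry < maxAttempts; retry++) {
            // Cap every wait at the configured maximum interval.
            long cappedMillis = (long) Math.min(nextMillis, maxInterval.toMillis());
            delays.add(Duration.ofMillis(cappedMillis));
            nextMillis *= multiplier;
        }
        return delays;
    }
}
```

With the defaults (2 seconds, multiplier 5, 3 minutes, 10 attempts) the first three waits are 2 s, 10 s and 50 s, and every later wait is capped at 3 minutes.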

Connection Properties

The prefix spring.ai.anthropic is used as the property prefix that lets you connect to Anthropic.

Property	Description	Default
spring.ai.anthropic.base-url	The URL to connect to.	api.anthropic.com
spring.ai.anthropic.completions-path	The path to append to the base URL.	`/v1/chat/completions`
spring.ai.anthropic.version	Anthropic API version.	2023-06-01
spring.ai.anthropic.api-key	The API key.	-
spring.ai.anthropic.beta-version	Enables new/experimental features. If set to `max-tokens-3-5-sonnet-2024-07-15`, the output token limit is increased from `4096` to `8192` tokens (for claude-3-5-sonnet only).	`tools-2024-04-04`

Configuration Properties

Enabling and disabling of the chat auto-configurations is now configured via top-level properties with the prefix spring.ai.model.chat.

To enable, set spring.ai.model.chat=anthropic (it is enabled by default).

To disable, set spring.ai.model.chat=none (or any value that does not match anthropic).

This change allows multiple models to be configured.

The prefix spring.ai.anthropic.chat is the property prefix that lets you configure the chat model implementation for Anthropic.

Property	Description	Default
spring.ai.anthropic.chat.enabled (removed and no longer valid)	Enable the Anthropic chat model.	true
spring.ai.model.chat	Enable the Anthropic chat model.	anthropic
spring.ai.anthropic.chat.options.model	The Anthropic chat model to use. Supports: `claude-opus-4-0`, `claude-sonnet-4-0`, `claude-3-7-sonnet-latest`, `claude-3-5-sonnet-latest`, `claude-3-opus-20240229`, `claude-3-sonnet-20240229`, `claude-3-haiku-20240307`, `claude-sonnet-4-20250514`, `claude-opus-4-1-20250805`	`claude-opus-4-20250514`
spring.ai.anthropic.chat.options.temperature	The sampling temperature, which controls the apparent creativity of generated completions. Higher values make output more random, while lower values make results more focused and deterministic. It is not recommended to modify temperature and top_p for the same completions request, as the interaction of these two settings is difficult to predict.	0.8
spring.ai.anthropic.chat.options.max-tokens	The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.	500
spring.ai.anthropic.chat.options.stop-sequence	Custom text sequences that will cause the model to stop generating. Our models will normally stop when they have naturally completed their turn, which results in a response stop_reason of "end_turn". If you want the model to stop generating when it encounters custom strings of text, you can use the stop_sequences parameter. If the model encounters one of the custom sequences, the response stop_reason value will be "stop_sequence" and the response stop_sequence value will contain the matched stop sequence.	-
spring.ai.anthropic.chat.options.top-p	Use nucleus sampling. In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches the probability specified by top_p. You should alter either temperature or top_p, but not both. Recommended for advanced use cases only. You usually only need to use temperature.	-
spring.ai.anthropic.chat.options.top-k	Only sample from the top K options for each subsequent token. Used to remove "long tail" low-probability responses. Recommended for advanced use cases only. You usually only need to use temperature.	-
spring.ai.anthropic.chat.options.tool-names	List of tools, identified by their names, to enable for tool calling in a single prompt request. Tools with those names must exist in the toolCallbacks registry.	-
spring.ai.anthropic.chat.options.tool-callbacks	Tool callbacks to register with the chat model.	-
spring.ai.anthropic.chat.options.toolChoice	Controls which (if any) tool is called by the model. `none` means the model will not call a function and instead generates a message. `auto` means the model can pick between generating a message or calling a tool. Specifying a particular tool via `{"type": "tool", "name": "my_tool"}` forces the model to call that tool. `none` is the default when no functions are present. `auto` is the default if functions are present.	-
spring.ai.anthropic.chat.options.internal-tool-execution-enabled	If false, Spring AI will not handle the tool calls internally, but will proxy them to the client. It is then the client's responsibility to handle the tool calls, dispatch them to the appropriate function, and return the results. If true (the default), Spring AI will handle the function calls internally. Applicable only to chat models with function calling support.	true
spring.ai.anthropic.chat.options.http-headers	Optional HTTP headers to be added to the chat completion request.	-

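For the latest list of model aliases and their descriptions, see the official Anthropic model aliases documentation.

All properties prefixed with spring.ai.anthropic.chat.options can be overridden at runtime by adding request-specific runtime options to the Prompt call.

Runtime Options

The AnthropicChatOptions.java class provides model configurations, such as the model to use, the temperature, the maximum token count, and so on.

On start-up, the default options can be configured with the AnthropicChatModel(api, options) constructor or the spring.ai.anthropic.chat.options.* properties.

At run-time, you can override the default options by adding new, request-specific options to the Prompt call. For example, to override the default model and temperature for a specific request: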

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        AnthropicChatOptions.builder()
            .model("claude-3-7-sonnet-latest")
            .temperature(0.4)
            .build()
    ));

In addition to the model-specific AnthropicChatOptions, you can use a portable ChatOptions instance, created with ChatOptions#builder().
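
The override behaviour described in this section can be pictured as a field-by-field merge in which request values win and unset values fall back to the start-up defaults. The record below is a conceptual model only: RequestOptions is a hypothetical type, not one of the Spring AI option classes, and it makes no claim about how the library performs the merge internally.

```java
// Conceptual model only: a hypothetical record, not a Spring AI class.
// A null field means "not set on this request".
record RequestOptions(String model, Double temperature, Integer maxTokens) {

    // Request-specific values win; unset values fall back to the defaults.
    RequestOptions mergedOver(RequestOptions defaults) {
        return new RequestOptions(
            model != null ? model : defaults.model(),
            temperature != null ? temperature : defaults.temperature(),
            maxTokens != null ? maxTokens : defaults.maxTokens());
    }
}
```

For the pirate-names request above, the merged result would carry the request's model and temperature while keeping the default maximum token count.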

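Prompt Caching

Anthropic's prompt caching feature allows you to cache frequently used prompts to reduce costs and improve response times for repeated interactions. When you cache a prompt, subsequent identical requests can reuse the cached content, significantly reducing the number of input tokens processed.

Supported Models

Prompt caching is currently supported on Claude Opus 4, Claude Sonnet 4, Claude Sonnet 3.7, Claude Sonnet 3.5, Claude Haiku 3.5, Claude Haiku 3, and Claude Opus 3.

Token Requirements

Different models have different minimum token thresholds for cache effectiveness:
- Claude Sonnet 4: 1024+ tokens
- Claude Haiku models: 2048+ tokens
- Other models: 1024+ tokens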
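
The thresholds above can be encoded as a small lookup, which is handy for deciding up front whether a prompt is worth caching. This is a sketch under one assumption: that a model belongs to the Haiku family whenever its name contains "haiku". CacheThresholds is a hypothetical helper, not a Spring AI class.

```java
import java.util.Locale;

// Hypothetical helper that encodes the minimum token thresholds listed above.
final class CacheThresholds {

    // Assumption: Haiku models are recognised by "haiku" in the model name.
    static int minimumTokens(String modelName) {
        boolean isHaiku = modelName.toLowerCase(Locale.ROOT).contains("haiku");
        return isHaiku ? 2048 : 1024;
    }

    // True when a prompt of the given size reaches the model's threshold.
    static boolean meetsMinimum(String modelName, int promptTokens) {
        return promptTokens >= minimumTokens(modelName);
    }
}
```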

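Caching Strategies

Spring AI exposes its caching strategies through the AnthropicCacheStrategy enum. Each strategy automatically places cache breakpoints at optimal locations while staying within Anthropic's 4-breakpoint limit.

Strategy	Breakpoints Used	Use Case
`NONE`	0	Disables prompt caching completely. Use when requests are one-off or content is too small to benefit from caching.
`SYSTEM_ONLY`	1	Caches system message content. Tools are cached implicitly via Anthropic's automatic ~20-block lookback mechanism. Use when system prompts are large and stable and there are fewer than 20 tools.
`TOOLS_ONLY`	1	Caches tool definitions only. System messages remain uncached and are processed fresh on each request. Use when tool definitions are large and stable (5000+ tokens) but system prompts change frequently or vary per tenant/context.
`SYSTEM_AND_TOOLS`	2	Caches both tool definitions (breakpoint 1) and the system message (breakpoint 2) explicitly. Use when you have 20+ tools (beyond automatic lookback) or want deterministic caching of both components. System changes do not invalidate the tool cache.
`CONVERSATION_HISTORY`	1-4	Caches the entire conversation history up to the current user question. Use for multi-turn conversations with chat memory, where the conversation history grows over time.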

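Due to Anthropic's cascade invalidation, changing tool definitions invalidates ALL downstream cache breakpoints (system, messages). Tool stability is critical when using the SYSTEM_AND_TOOLS or CONVERSATION_HISTORY strategies.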
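
Cascade invalidation is easy to reason about as an ordered hierarchy: a change at one layer invalidates that layer and everything after it. The enum below is an illustrative model of the tools, system, messages ordering. CacheLayer is a hypothetical type and is unrelated to the AnthropicCacheStrategy enum.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative model of the tools -> system -> messages cache hierarchy.
enum CacheLayer {
    TOOLS, SYSTEM, MESSAGES;

    // A change at one layer invalidates that layer and every layer after it.
    static Set<CacheLayer> invalidatedBy(CacheLayer changed) {
        return EnumSet.range(changed, MESSAGES);
    }

    // Layers that precede the changed one keep their cached prefix.
    static Set<CacheLayer> stillValidAfter(CacheLayer changed) {
        return EnumSet.complementOf(EnumSet.range(changed, MESSAGES));
    }
}
```

In this model, CacheLayer.stillValidAfter(CacheLayer.SYSTEM) contains only TOOLS, while a tool change leaves nothing valid.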

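Enabling Prompt Caching

Enable prompt caching by setting cacheOptions on AnthropicChatOptions and choosing a strategy.

System-Only Caching

Best for: stable system prompts with <20 tools (tools are cached implicitly via automatic lookback).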

// Cache system message content (tools cached implicitly)
ChatResponse response = chatModel.call(
    new Prompt(
        List.of(
            new SystemMessage("You are a helpful AI assistant with extensive knowledge..."),
            new UserMessage("What is machine learning?")
        ),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(AnthropicCacheOptions.builder()
                .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
                .build())
            .maxTokens(500)
            .build()
    )
);

Tools-Only Caching

Best for: large, stable tool sets with dynamic system prompts (multi-tenant apps, A/B testing).

// Cache tool definitions, system prompt processed fresh each time
ChatResponse response = chatModel.call(
    new Prompt(
        List.of(
            new SystemMessage("You are a " + persona + " assistant..."), // Dynamic per-tenant
            new UserMessage("What's the weather like in San Francisco?")
        ),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(AnthropicCacheOptions.builder()
                .strategy(AnthropicCacheStrategy.TOOLS_ONLY)
                .build())
            .toolCallbacks(weatherToolCallback) // Large tool set cached
            .maxTokens(500)
            .build()
    )
);

System and Tools Caching

Best for: 20+ tools (beyond automatic lookback) or when both components should be cached independently.

// Cache both tool definitions and system message with independent breakpoints
// Changing system won't invalidate tool cache (but changing tools invalidates both)
ChatResponse response = chatModel.call(
    new Prompt(
        List.of(
            new SystemMessage("You are a weather analysis assistant..."),
            new UserMessage("What's the weather like in San Francisco?")
        ),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(AnthropicCacheOptions.builder()
                .strategy(AnthropicCacheStrategy.SYSTEM_AND_TOOLS)
                .build())
            .toolCallbacks(weatherToolCallback) // 20+ tools
            .maxTokens(500)
            .build()
    )
);

Conversation History Caching

// Cache conversation history with ChatClient and memory (cache breakpoint on last user message)
ChatClient chatClient = ChatClient.builder(chatModel)
    .defaultSystem("You are a personalized career counselor...")
    .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory)
        .conversationId(conversationId)
        .build())
    .build();

String response = chatClient.prompt()
    .user("What career advice would you give me?")
    .options(AnthropicChatOptions.builder()
        .model("claude-sonnet-4")
        .cacheOptions(AnthropicCacheOptions.builder()
            .strategy(AnthropicCacheStrategy.CONVERSATION_HISTORY)
            .build())
        .maxTokens(500)
        .build())
    .call()
    .content();

Using the ChatClient Fluent API

String response = ChatClient.create(chatModel)
    .prompt()
    .system("You are an expert document analyst...")
    .user("Analyze this large document: " + document)
    .options(AnthropicChatOptions.builder()
        .model("claude-sonnet-4")
        .cacheOptions(AnthropicCacheOptions.builder()
            .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
            .build())
        .build())
    .call()
    .content();

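Advanced Caching Options

Per-Message TTL (5m or 1h)

By default, cached content uses a 5-minute TTL. You can set a 1-hour TTL for specific message types. When a 1-hour TTL is used, Spring AI automatically sets the required Anthropic beta header.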

ChatResponse response = chatModel.call(
    new Prompt(
        List.of(new SystemMessage(largeSystemPrompt)),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(AnthropicCacheOptions.builder()
                .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
                .messageTypeTtl(MessageType.SYSTEM, AnthropicCacheTtl.ONE_HOUR)
                .build())
            .maxTokens(500)
            .build()
    )
);

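Extended TTL uses the Anthropic beta feature extended-cache-ttl-2025-04-11.

Cache Eligibility Filters

Control when cache breakpoints are used by setting minimum content lengths and an optional token-based length function: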

AnthropicCacheOptions cache = AnthropicCacheOptions.builder()
    .strategy(AnthropicCacheStrategy.CONVERSATION_HISTORY)
    .messageTypeMinContentLength(MessageType.SYSTEM, 1024)
    .messageTypeMinContentLength(MessageType.USER, 1024)
    .messageTypeMinContentLength(MessageType.ASSISTANT, 1024)
    .contentLengthFunction(text -> MyTokenCounter.count(text))
    .build();

ChatResponse response = chatModel.call(
    new Prompt(
        List.of(/* messages */),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(cache)
            .build()
    )
);

Tool definitions are always considered for caching if the SYSTEM_AND_TOOLS strategy is used, regardless of content length.
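
The eligibility filter boils down to measuring content with a pluggable length function and comparing the result with a minimum. The sketch below shows that idea in plain Java. CacheEligibility is a hypothetical stand-in (the real decision is made inside Spring AI), and the estimate of about four characters per token is only a rough heuristic assumed for illustration.

```java
import java.util.function.ToIntFunction;

// Hypothetical stand-in for the eligibility check described above.
final class CacheEligibility {

    // Rough token estimate, assuming about four characters per token.
    static final ToIntFunction<String> APPROX_TOKENS = text -> (text.length() + 3) / 4;

    // Content qualifies when its measured length reaches the minimum.
    static boolean isEligible(String content, int minLength,
                              ToIntFunction<String> lengthFunction) {
        return content != null && lengthFunction.applyAsInt(content) >= minLength;
    }
}
```

Passing String::length measures characters instead of estimated tokens, which corresponds to configuring no token-based length function in this model.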

Usage Example

Here is a complete example demonstrating prompt caching with cost tracking:

// Create system content that will be reused multiple times
String largeSystemPrompt = "You are an expert software architect specializing in distributed systems...";

// First request - creates cache
ChatResponse firstResponse = chatModel.call(
    new Prompt(
        List.of(
            new SystemMessage(largeSystemPrompt),
            new UserMessage("What is microservices architecture?")
        ),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(AnthropicCacheOptions.builder()
                .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
                .build())
            .maxTokens(500)
            .build()
    )
);

// Access cache-related token usage
AnthropicApi.Usage firstUsage = (AnthropicApi.Usage) firstResponse.getMetadata()
    .getUsage().getNativeUsage();

System.out.println("Cache creation tokens: " + firstUsage.cacheCreationInputTokens());
System.out.println("Cache read tokens: " + firstUsage.cacheReadInputTokens());

// Second request with same system prompt - reads from cache
ChatResponse secondResponse = chatModel.call(
    new Prompt(
        List.of(
            new SystemMessage(largeSystemPrompt),
            new UserMessage("What are the benefits of event sourcing?")
        ),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(AnthropicCacheOptions.builder()
                .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
                .build())
            .maxTokens(500)
            .build()
    )
);

AnthropicApi.Usage secondUsage = (AnthropicApi.Usage) secondResponse.getMetadata()
    .getUsage().getNativeUsage();

System.out.println("Cache creation tokens: " + secondUsage.cacheCreationInputTokens()); // Should be 0
System.out.println("Cache read tokens: " + secondUsage.cacheReadInputTokens()); // Should be > 0

Token Usage Tracking

The Usage record provides detailed information about cache-related token consumption. To access Anthropic-specific cache metrics, use the getNativeUsage() method:

AnthropicApi.Usage usage = (AnthropicApi.Usage) response.getMetadata()
    .getUsage().getNativeUsage();

Cache-specific metrics include:

- cacheCreationInputTokens(): returns the number of tokens used when creating a cache entry
- cacheReadInputTokens(): returns the number of tokens read from an existing cache entry

When you first send a cached prompt:
- cacheCreationInputTokens() will be greater than 0
- cacheReadInputTokens() will be 0

When you send the same cached prompt again:
- cacheCreationInputTokens() will be 0
- cacheReadInputTokens() will be greater than 0
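
The two counters can be turned into a simple classification for logging or metrics. The record below is a hypothetical summary type whose fields merely correspond to cacheCreationInputTokens() and cacheReadInputTokens(); it takes plain ints and does not wrap the Spring AI usage object. The MIXED case, where both counters are positive, is an assumption about requests that extend an already cached prefix.

```java
// Hypothetical summary type; the two counters correspond to
// cacheCreationInputTokens() and cacheReadInputTokens() on the native usage.
record CacheUsage(int cacheCreationInputTokens, int cacheReadInputTokens) {

    enum Outcome { CACHE_WRITE, CACHE_HIT, MIXED, NOT_CACHED }

    // Classifies a response by which cache counters are non-zero.
    Outcome outcome() {
        boolean wrote = cacheCreationInputTokens > 0;
        boolean read = cacheReadInputTokens > 0;
        if (wrote && read) {
            return Outcome.MIXED;
        }
        if (wrote) {
            return Outcome.CACHE_WRITE;
        }
        if (read) {
            return Outcome.CACHE_HIT;
        }
        return Outcome.NOT_CACHED;
    }
}
```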

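Real-World Use Cases

Legal Document Analysis

Analyze large legal contracts or compliance documents efficiently by caching the document content across multiple questions: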

// Load a legal contract (PDF or text)
String legalContract = loadDocument("merger-agreement.pdf"); // ~3000 tokens

// System prompt with legal expertise
String legalSystemPrompt = "You are an expert legal analyst specializing in corporate law. " +
    "Analyze the following contract and provide precise answers about terms, obligations, and risks: " +
    legalContract;

// First analysis - creates cache
ChatResponse riskAnalysis = chatModel.call(
    new Prompt(
        List.of(
            new SystemMessage(legalSystemPrompt),
            new UserMessage("What are the key termination clauses and associated penalties?")
        ),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(AnthropicCacheOptions.builder()
                .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
                .build())
            .maxTokens(1000)
            .build()
    )
);

// Subsequent questions reuse cached document - 90% cost savings
ChatResponse obligationAnalysis = chatModel.call(
    new Prompt(
        List.of(
            new SystemMessage(legalSystemPrompt), // Same content - cache hit
            new UserMessage("List all financial obligations and payment schedules.")
        ),
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .cacheOptions(AnthropicCacheOptions.builder()
                .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
                .build())
            .maxTokens(1000)
            .build()
    )
);

Batch Code Review

Process multiple code files with consistent review criteria while caching the review guidelines:

// Define comprehensive code review guidelines
String reviewGuidelines = """
    You are a senior software engineer conducting code reviews. Apply these criteria:
    - Security vulnerabilities and best practices
    - Performance optimizations and memory usage
    - Code maintainability and readability
    - Testing coverage and edge cases
    - Design patterns and architecture compliance
    """;

List<String> codeFiles = Arrays.asList(
    "UserService.java", "PaymentController.java", "SecurityConfig.java"
);

List<String> reviews = new ArrayList<>();

for (String filename : codeFiles) {
    String sourceCode = loadSourceFile(filename);

    ChatResponse review = chatModel.call(
        new Prompt(
            List.of(
                new SystemMessage(reviewGuidelines), // Cached across all reviews
                new UserMessage("Review this " + filename + " code:\n\n" + sourceCode)
            ),
            AnthropicChatOptions.builder()
                .model("claude-sonnet-4")
                .cacheOptions(AnthropicCacheOptions.builder()
                    .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
                    .build())
                .maxTokens(800)
                .build()
        )
    );

    reviews.add(review.getResult().getOutput().getText());
}

// Guidelines cached after first request, subsequent reviews are faster and cheaper

Multi-Tenant SaaS with Shared Tools

Build a multi-tenant application where tools are shared but system prompts are customized per tenant:

// Define large shared tool set (used by all tenants)
List<ToolCallback> sharedTools = Arrays.asList(
    weatherToolCallback,    // ~500 tokens
    calendarToolCallback,   // ~800 tokens
    emailToolCallback,      // ~700 tokens
    analyticsToolCallback,  // ~600 tokens
    reportingToolCallback   // ~900 tokens
    // ... 20+ more tools, totaling 5000+ tokens
);

@Service
public class MultiTenantAIService {

    public String handleTenantRequest(String tenantId, String userQuery) {
        // Get tenant-specific configuration
        TenantConfig config = tenantRepository.findById(tenantId);

        // Dynamic system prompt per tenant
        String tenantSystemPrompt = String.format("""
            You are %s's AI assistant. Company values: %s.
            Brand voice: %s. Compliance requirements: %s.
            """, config.companyName(), config.values(),
                 config.brandVoice(), config.compliance());

        ChatResponse response = chatModel.call(
            new Prompt(
                List.of(
                    new SystemMessage(tenantSystemPrompt), // Different per tenant, NOT cached
                    new UserMessage(userQuery)
                ),
                AnthropicChatOptions.builder()
                    .model("claude-sonnet-4")
                    .cacheOptions(AnthropicCacheOptions.builder()
                        .strategy(AnthropicCacheStrategy.TOOLS_ONLY) // Cache tools only
                        .build())
                    .toolCallbacks(sharedTools) // Cached once, shared across all tenants
                    .maxTokens(800)
                    .build()
            )
        );

        return response.getResult().getOutput().getText();
    }
}

// Tools cached once (5000 tokens @ 10% = 500 token cost for cache hits)
// Each tenant's unique system prompt processed fresh (200-500 tokens @ 100%)
// Total per request: ~700-1000 tokens vs 5500+ without TOOLS_ONLY
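
The arithmetic in the closing comments can be reproduced with a few lines of Java. This is a worked example, not a pricing reference: the 10% cache-read rate is the figure used in the comments above and is taken here as an assumption, and CacheSavings is a hypothetical helper.

```java
// Worked arithmetic for the token estimates in the comments above.
final class CacheSavings {

    // Assumption from the example: cached tokens are billed at 10%.
    static final long CACHE_READ_PERCENT = 10;

    // Input cost expressed in full-price token equivalents.
    static long effectiveInputTokens(long cachedTokens, long uncachedTokens) {
        return cachedTokens * CACHE_READ_PERCENT / 100 + uncachedTokens;
    }
}
```

For 5000 cached tool tokens and a 200 to 500 token system prompt this yields 700 to 1000 token equivalents per request, compared with 5500 when nothing is cached.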

Customer Support with Knowledge Base

Create a customer support system that caches your product knowledge base for consistent, accurate responses:

// Load comprehensive product knowledge
String knowledgeBase = """
    PRODUCT DOCUMENTATION:
    - API endpoints and authentication methods
    - Common troubleshooting procedures
    - Billing and subscription details
    - Integration guides and examples
    - Known issues and workarounds
    """ + loadProductDocs(); // ~2500 tokens

@Service
public class CustomerSupportService {

    public String handleCustomerQuery(String customerQuery, String customerId) {
        ChatResponse response = chatModel.call(
            new Prompt(
                List.of(
                    new SystemMessage("You are a helpful customer support agent. " +
                        "Use this knowledge base to provide accurate solutions: " + knowledgeBase),
                    new UserMessage("Customer " + customerId + " asks: " + customerQuery)
                ),
                AnthropicChatOptions.builder()
                    .model("claude-sonnet-4")
                    .cacheOptions(AnthropicCacheOptions.builder()
                        .strategy(AnthropicCacheStrategy.SYSTEM_ONLY)
                        .build())
                    .maxTokens(600)
                    .build()
            )
        );

        return response.getResult().getOutput().getText();
    }
}

// Knowledge base is cached across all customer queries
// Multiple support agents can benefit from the same cached content

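Best Practices

Choose the right strategy:
- Use SYSTEM_ONLY for stable system prompts with <20 tools (tools are cached implicitly via automatic lookback)
- Use TOOLS_ONLY for large, stable tool sets (5000+ tokens) with dynamic system prompts (multi-tenant, A/B testing)
- Use SYSTEM_AND_TOOLS when you have 20+ tools (beyond automatic lookback) or want both components cached independently
- Use CONVERSATION_HISTORY for multi-turn conversations with ChatClient memory
- Use NONE to explicitly disable caching
Understand cascade invalidation: Anthropic's cache hierarchy (tools → system → messages) means that changes flow downward:
- Changing tools invalidates: tools + system + messages (all caches) ❌❌❌
- Changing system invalidates: system + messages (the tools cache remains valid) ✅❌❌
- Changing messages invalidates: messages only (the tools and system caches remain valid) ✅✅❌
  Tool stability is critical when using the SYSTEM_AND_TOOLS or CONVERSATION_HISTORY strategies.
SYSTEM_AND_TOOLS independence: With SYSTEM_AND_TOOLS, changing the system message does NOT invalidate the tool cache, so cached tools can be reused efficiently even when system prompts vary.
Meet token requirements: Focus on caching content that meets the minimum token requirements (1024+ tokens for Sonnet 4, 2048+ for Haiku models).
Reuse identical content: Caching works best with exact matches of prompt content. Even small changes require a new cache entry.
Monitor token usage: Use the cache usage statistics to track cache effectiveness:
    AnthropicApi.Usage usage = (AnthropicApi.Usage) response.getMetadata().getUsage().getNativeUsage();
    if (usage != null) {
        System.out.println("Cache creation: " + usage.cacheCreationInputTokens());
        System.out.println("Cache read: " + usage.cacheReadInputTokens());
    }
Strategic cache placement: The implementation automatically places cache breakpoints at optimal locations based on your chosen strategy, ensuring compliance with Anthropic's 4-breakpoint limit.
Cache lifetime: The default TTL is 5 minutes; set a 1-hour TTL per message type via messageTypeTtl(...). Each cache access resets the timer.
Tool caching limitations: Be aware that tool-based interactions may not provide cache usage metadata in the response.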
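
The cache lifetime rule, where every access resets the timer, behaves like a sliding expiry window. The class below is an illustrative model of that behaviour with explicit timestamps so it can be reasoned about deterministically. SlidingTtlEntry is a hypothetical name, and the real cache lives on Anthropic's side rather than in application code.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative model of a cache entry whose TTL restarts on every read.
final class SlidingTtlEntry {

    private final Duration ttl;
    private Instant expiresAt;

    SlidingTtlEntry(Duration ttl, Instant createdAt) {
        this.ttl = ttl;
        this.expiresAt = createdAt.plus(ttl);
    }

    // Returns true on a hit and pushes the expiry forward by one TTL.
    boolean access(Instant now) {
        if (!now.isBefore(expiresAt)) {
            return false; // expired: the next request would re-create the entry
        }
        expiresAt = now.plus(ttl);
        return true;
    }
}
```

With a 5-minute TTL, requests spaced four minutes apart keep hitting the entry indefinitely in this model, whereas a single gap of five minutes or more lets it expire.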

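Implementation Details

The prompt caching implementation in Spring AI follows these key design principles:

Strategic cache placement: Cache breakpoints are automatically placed at optimal locations based on the chosen strategy, ensuring compliance with Anthropic's 4-breakpoint limit.
- CONVERSATION_HISTORY places cache breakpoints on: tools (if present), the system message, and the last user message
- This enables Anthropic's prefix matching to incrementally cache the growing conversation history
- Each turn builds on the previously cached prefix, maximizing cache reuse
Provider portability: Cache configuration is done through AnthropicChatOptions rather than on individual messages, preserving compatibility when switching between AI providers.
Thread safety: Cache breakpoint tracking is implemented with thread-safe mechanisms to handle concurrent requests correctly.
Automatic content ordering: The implementation ensures the proper on-the-wire ordering of JSON content blocks and cache controls, in line with Anthropic's API requirements.
Aggregate eligibility checking: For CONVERSATION_HISTORY, the implementation considers all message types (user, assistant, tool) within the last ~20 content blocks when determining whether the combined content meets the minimum token threshold for caching.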
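
Why a growing conversation benefits from prefix matching can be seen with a simplified comparison of two turns. The sketch below treats messages as plain strings and counts how many leading messages a new request shares with what was sent before. PrefixMatch is a hypothetical helper, and the real API compares content blocks up to cache breakpoints rather than whole strings.

```java
import java.util.List;

// Simplified picture of prefix-based cache reuse across conversation turns.
final class PrefixMatch {

    // Counts how many leading messages of the request equal the cached prefix.
    static int sharedPrefixLength(List<String> cachedPrefix, List<String> request) {
        int limit = Math.min(cachedPrefix.size(), request.size());
        int matched = 0;
        while (matched < limit
                && cachedPrefix.get(matched).equals(request.get(matched))) {
            matched++;
        }
        return matched;
    }
}
```

When each turn only appends messages, the whole previous turn is a shared prefix; a change to the first message drops the shared length to zero, which is the string-level analogue of cascade invalidation.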

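Future Enhancements

The current cache strategies are designed to handle 90% of common use cases effectively. For applications requiring more granular control, future enhancements may include:

- Message-level cache control for fine-grained breakpoint placement
- Multi-block content caching within individual messages
- Advanced cache boundary selection for complex tool scenarios
- Mixed TTL strategies for optimized cache hierarchies

These enhancements will maintain full backward compatibility while unlocking Anthropic's complete prompt caching capabilities for specialized use cases.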

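Thinking

Anthropic Claude models support a "thinking" feature that allows the model to show its reasoning process before providing a final answer. This feature enables more transparent and detailed problem-solving, particularly for complex questions that require step-by-step reasoning.

Supported Models

The thinking feature is supported by the following Claude models:

- Claude 4 models (claude-opus-4-20250514, claude-sonnet-4-20250514)
- Claude 3.7 Sonnet (claude-3-7-sonnet-20250219)

Model capabilities:

- Claude 3.7 Sonnet: returns the full thinking output. Behavior is consistent, but summarized and interleaved thinking are not supported.
- Claude 4 models: support summarized thinking, interleaved thinking, and enhanced tool integration.

The API request structure is the same across all supported models, but output behavior varies.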

Thinking Configuration

To enable thinking on any supported Claude model, include the following configuration in your request:

Required Configuration

Add the thinking object:
- "type": "enabled"
- budget_tokens: token limit for reasoning (starting at 1024 is recommended)
Token budget rules:
- budget_tokens must typically be less than max_tokens
- Claude may use fewer tokens than allocated
- Larger budgets increase the depth of reasoning but may impact latency
- When using tool use with interleaved thinking (Claude 4 only), this constraint is relaxed, but this is not yet supported in Spring AI.
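
The budget rules lend themselves to a small pre-flight check before a request is sent. The sketch below assumes a minimum budget of 1024 tokens, following the recommended starting value above, and a budget strictly below max_tokens. ThinkingBudget is a hypothetical helper; the API itself performs the authoritative validation.

```java
// Hypothetical pre-flight check for the token budget rules listed above.
final class ThinkingBudget {

    // Assumption: 1024 is treated as the smallest acceptable budget.
    static final int MIN_BUDGET_TOKENS = 1024;

    static void validate(int budgetTokens, int maxTokens) {
        if (budgetTokens < MIN_BUDGET_TOKENS) {
            throw new IllegalArgumentException(
                "budget_tokens must be at least " + MIN_BUDGET_TOKENS);
        }
        if (budgetTokens >= maxTokens) {
            throw new IllegalArgumentException(
                "budget_tokens must be less than max_tokens");
        }
    }
}
```

A budget of 2048 with max_tokens of 8192 passes this check, while a budget equal to max_tokens is rejected.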

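Key Considerations

- Claude 3.7 returns the full thinking content in the response
- Claude 4 returns a summarized version of the model's internal reasoning to reduce latency and protect sensitive content
- Thinking tokens are billable as part of the output tokens (even if not all of them are visible in the response)
- Interleaved thinking is only available on Claude 4 models and requires the beta header interleaved-thinking-2025-05-14

Tool Integration and Interleaved Thinking

Claude 4 models support interleaved thinking with tool use, allowing the model to reason between tool calls.

The current Spring AI implementation supports basic thinking and tool use separately, but does not yet support interleaved thinking with tool use (where thinking continues across multiple tool calls).

For details on interleaved thinking with tool use, see the Anthropic documentation.

Non-streaming Example

Here is how to enable thinking in a non-streaming request using the ChatClient API: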

ChatClient chatClient = ChatClient.create(chatModel);

// For Claude 3.7 Sonnet - explicit thinking configuration required
ChatResponse response = chatClient.prompt()
    .options(AnthropicChatOptions.builder()
        .model("claude-3-7-sonnet-latest")
        .temperature(1.0)  // Temperature should be set to 1 when thinking is enabled
        .maxTokens(8192)
        .thinking(AnthropicApi.ThinkingType.ENABLED, 2048)  // Must be ≥1024 && < max_tokens
        .build())
    .user("Are there an infinite number of prime numbers such that n mod 4 == 3?")
    .call()
    .chatResponse();

// For Claude 4 models - thinking is enabled by default
ChatResponse response4 = chatClient.prompt()
    .options(AnthropicChatOptions.builder()
        .model("claude-opus-4-0")
        .maxTokens(8192)
        // No explicit thinking configuration needed
        .build())
    .user("Are there an infinite number of prime numbers such that n mod 4 == 3?")
    .call()
    .chatResponse();

// Process the response which may contain thinking content
for (Generation generation : response.getResults()) {
    AssistantMessage message = generation.getOutput();
    if (message.getText() != null) {
        // Regular text response
        System.out.println("Text response: " + message.getText());
    }
    else if (message.getMetadata().containsKey("signature")) {
        // Thinking content
        System.out.println("Thinking: " + message.getMetadata().get("thinking"));
        System.out.println("Signature: " + message.getMetadata().get("signature"));
    }
}

Streaming Example

You can also use thinking with streaming responses:

ChatClient chatClient = ChatClient.create(chatModel);

// For Claude 3.7 Sonnet - explicit thinking configuration
Flux<ChatResponse> responseFlux = chatClient.prompt()
    .options(AnthropicChatOptions.builder()
        .model("claude-3-7-sonnet-latest")
        .temperature(1.0)
        .maxTokens(8192)
        .thinking(AnthropicApi.ThinkingType.ENABLED, 2048)
        .build())
    .user("Are there an infinite number of prime numbers such that n mod 4 == 3?")
    .stream()
    .chatResponse();

// For Claude 4 models - thinking is enabled by default
Flux<ChatResponse> responseFlux4 = chatClient.prompt()
    .options(AnthropicChatOptions.builder()
        .model("claude-opus-4-0")
        .maxTokens(8192)
        // No explicit thinking configuration needed
        .build())
    .user("Are there an infinite number of prime numbers such that n mod 4 == 3?")
    .stream()
    .chatResponse();

// For streaming, you might want to collect just the text responses
String textContent = responseFlux.collectList()
    .block()
    .stream()
    .map(ChatResponse::getResults)
    .flatMap(List::stream)
    .map(Generation::getOutput)
    .map(AssistantMessage::getText)
    .filter(text -> text != null && !text.isBlank())
    .collect(Collectors.joining());

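Tool Use Integration

Claude 4 models integrate thinking and tool use capabilities:

Claude 3.7 Sonnet: Supports both thinking and tool use, but they operate separately and require more explicit configuration
Claude 4 models: Interleave thinking and tool use natively, providing deeper reasoning during tool interactions

Benefits of Using Thinking

The thinking feature provides several benefits:

Transparency: See the model's reasoning process and how it arrived at its conclusion
Debugging: Identify where the model may be making logical errors
Education: Use the step-by-step reasoning as a teaching tool
Complex problem solving: Better results on math, logic, and reasoning tasks

Note that enabling thinking requires a higher token budget, because the thinking process itself consumes tokens from your allocation.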

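Tool/Function Calling

You can use the AnthropicChatModel and have the Anthropic Claude model intelligently choose to output a JSON object containing the arguments for calling one or more registered functions. This is a powerful technique for connecting LLM capabilities with external tools and APIs. Read more about Tool Calling.

Tool Choice

The tool_choice parameter lets you control how the model uses the provided tools. This gives you fine-grained control over tool execution behavior.

For complete API details, see the Anthropic tool_choice documentation.

Tool Choice Options

Spring AI exposes the following tool choice options through the AnthropicApi.ToolChoice interface:

ToolChoiceAuto (default): The model decides automatically whether to use tools or respond with text
ToolChoiceAny: The model must use at least one of the available tools
ToolChoiceTool: The model must use a specific tool by name
ToolChoiceNone: The model cannot use any tools

Disabling Parallel Tool Use

All tool choice options (except ToolChoiceNone) support a disableParallelToolUse parameter. When it is set to true, the model outputs at most one tool use.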
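For orientation, each option corresponds to a small `tool_choice` JSON object in the request sent to Anthropic. The sketch below builds those objects as strings in plain Java. `ToolChoiceJson` and its methods are hypothetical and exist only for illustration; Spring AI serializes the real `AnthropicApi.ToolChoice` types for you. The field names (`type`, `name`, `disable_parallel_tool_use`) are an assumption about the wire format and should be checked against the Anthropic tool_choice documentation.

```java
// Illustrative only: shows the assumed JSON shape behind each tool choice option.
final class ToolChoiceJson {

    private ToolChoiceJson() {
    }

    static String auto(boolean disableParallelToolUse) {
        return close("{\"type\":\"auto\"", disableParallelToolUse);
    }

    static String any(boolean disableParallelToolUse) {
        return close("{\"type\":\"any\"", disableParallelToolUse);
    }

    // Assumes the tool name contains no characters that need JSON escaping.
    static String tool(String name, boolean disableParallelToolUse) {
        return close("{\"type\":\"tool\",\"name\":\"" + name + "\"", disableParallelToolUse);
    }

    // "none" takes no parallel-tool-use flag, matching the text above.
    static String none() {
        return "{\"type\":\"none\"}";
    }

    private static String close(String prefix, boolean disableParallelToolUse) {
        return disableParallelToolUse
                ? prefix + ",\"disable_parallel_tool_use\":true}"
                : prefix + "}";
    }

    public static void main(String[] args) {
        System.out.println(auto(false));
        System.out.println(any(true));
        System.out.println(tool("get_weather", false));
        System.out.println(none());
    }
}
```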

Usage Examples

Auto Mode (Default Behavior)

Let the model decide whether to use tools:

ChatResponse response = chatModel.call(
    new Prompt(
        "What's the weather in San Francisco?",
        AnthropicChatOptions.builder()
            .toolChoice(new AnthropicApi.ToolChoiceAuto())
            .toolCallbacks(weatherToolCallback)
            .build()
    )
);

Force Tool Use (Any)

Require the model to use at least one tool:

ChatResponse response = chatModel.call(
    new Prompt(
        "What's the weather?",
        AnthropicChatOptions.builder()
            .toolChoice(new AnthropicApi.ToolChoiceAny())
            .toolCallbacks(weatherToolCallback, calculatorToolCallback)
            .build()
    )
);

Force a Specific Tool

Require the model to use a specific tool by name:

ChatResponse response = chatModel.call(
    new Prompt(
        "What's the weather in San Francisco?",
        AnthropicChatOptions.builder()
            .toolChoice(new AnthropicApi.ToolChoiceTool("get_weather"))
            .toolCallbacks(weatherToolCallback, calculatorToolCallback)
            .build()
    )
);

Disable Tool Use

Prevent the model from using any tools:

ChatResponse response = chatModel.call(
    new Prompt(
        "What's the weather in San Francisco?",
        AnthropicChatOptions.builder()
            .toolChoice(new AnthropicApi.ToolChoiceNone())
            .toolCallbacks(weatherToolCallback)
            .build()
    )
);

Disable Parallel Tool Use

Force the model to use only one tool at a time:

ChatResponse response = chatModel.call(
    new Prompt(
        "What's the weather in San Francisco and what's 2+2?",
        AnthropicChatOptions.builder()
            .toolChoice(new AnthropicApi.ToolChoiceAuto(true)) // disableParallelToolUse = true
            .toolCallbacks(weatherToolCallback, calculatorToolCallback)
            .build()
    )
);

Using the ChatClient API

You can also set the tool choice with the fluent ChatClient API:

String response = ChatClient.create(chatModel)
    .prompt()
    .user("What's the weather in San Francisco?")
    .options(AnthropicChatOptions.builder()
        .toolChoice(new AnthropicApi.ToolChoiceTool("get_weather"))
        .build())
    .call()
    .content();

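Use Cases

Validation: Use ToolChoiceTool to ensure that a specific tool is called for critical operations
Efficiency: Use ToolChoiceAny when you know a tool must be used, to avoid unnecessary text generation
Control: Use ToolChoiceNone to temporarily disable tool access while keeping the tool definitions registered
Sequential processing: Use disableParallelToolUse to force sequential tool execution for dependent operations

Multimodality

Multimodality refers to a model's ability to simultaneously understand and process information from multiple sources, including text, PDFs, images, and data formats.

Images

Currently, Anthropic Claude 3 supports the base64 source type for images, and the image/jpeg, image/png, image/gif, and image/webp media types. See the Vision guide for more information. Anthropic Claude 3.5 Sonnet also supports the pdf source type for application/pdf files.

Spring AI's Message interface supports multimodal AI models by introducing the Media type. This type holds the data and details of media attachments in messages, using Spring's org.springframework.util.MimeType and a java.lang.Object for the raw media data.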
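To make the base64 source type concrete, the sketch below checks a media type against the four image types listed above and base64-encodes raw bytes using only the JDK. `ImageSource`, `isSupported`, and `toBase64` are hypothetical names. Spring AI is expected to handle this encoding when it builds the request from a Media object, so this is only an illustration of what the encoded payload is, not something application code normally needs to do.

```java
import java.util.Base64;
import java.util.Locale;
import java.util.Set;

// Illustrative only; not part of Spring AI.
final class ImageSource {

    // The image media types named in the text above.
    static final Set<String> SUPPORTED_IMAGE_TYPES =
            Set.of("image/jpeg", "image/png", "image/gif", "image/webp");

    private ImageSource() {
    }

    // Case-insensitive membership check against the supported image types.
    static boolean isSupported(String mediaType) {
        return mediaType != null
                && SUPPORTED_IMAGE_TYPES.contains(mediaType.toLowerCase(Locale.ROOT));
    }

    // Standard (non-URL-safe) base64 with padding.
    static String toBase64(byte[] rawImageBytes) {
        return Base64.getEncoder().encodeToString(rawImageBytes);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("image/png"));
        System.out.println(isSupported("image/bmp"));
        System.out.println(toBase64(new byte[] {104, 105}));
    }
}
```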

Below is a simple code example, taken from AnthropicChatModelIT.java, that demonstrates combining user text with an image.

var imageData = new ClassPathResource("/multimodal.test.png");

// imageData and userMessage are local variables, so they are referenced without "this."
var userMessage = new UserMessage("Explain what do you see on this picture?",
        List.of(new Media(MimeTypeUtils.IMAGE_PNG, imageData)));

ChatResponse response = chatModel.call(new Prompt(List.of(userMessage)));

logger.info(response.getResult().getOutput().getText());

It takes the multimodal.test.png image as input:

along with the text message "Explain what do you see on this picture?", and generates a response like this:

The image shows a close-up view of a wire fruit basket containing several pieces of fruit.
...

PDF

PDF support (beta) is available starting with Sonnet 3.5. Use the application/pdf media type to attach a PDF file to the message:

var pdfData = new ClassPathResource("/spring-ai-reference-overview.pdf");

var userMessage = new UserMessage(
        "You are a very professional document summarization specialist. Please summarize the given document.",
        List.of(new Media(new MimeType("application", "pdf"), pdfData)));

var response = this.chatModel.call(new Prompt(List.of(userMessage)));

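Citations

Anthropic's Citations API lets Claude reference specific parts of the provided documents when generating a response. When citation documents are included in the prompt, Claude can cite the source material, and the citation metadata (character ranges, page numbers, or content blocks) is returned in the response metadata.

Citations help improve:

Accuracy verification: Users can check Claude's answers against the source material
Transparency: See exactly which parts of the documents support the response
Compliance: Meet source attribution requirements in regulated industries
Trust: Build confidence by showing where information came from

Supported Models

Citations are supported on Claude 3.7 Sonnet and the Claude 4 models (Opus and Sonnet).

Document Types

Three types of citation documents are supported:

Plain text: Text content with character-level citations
PDF: PDF documents with page-level citations
Custom content: User-defined content blocks with block-level citations

Creating Citation Documents

Use the CitationDocument builder to create documents that can be cited:

Plain Text Documents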

CitationDocument document = CitationDocument.builder()
    .plainText("The Eiffel Tower was completed in 1889 in Paris, France. " +
               "It stands 330 meters tall and was designed by Gustave Eiffel.")
    .title("Eiffel Tower Facts")
    .citationsEnabled(true)
    .build();

PDF Documents

// From file path
CitationDocument document = CitationDocument.builder()
    .pdfFile("path/to/document.pdf")
    .title("Technical Specification")
    .citationsEnabled(true)
    .build();

// From byte array
byte[] pdfBytes = loadPdfBytes();
CitationDocument document = CitationDocument.builder()
    .pdf(pdfBytes)
    .title("Product Manual")
    .citationsEnabled(true)
    .build();

Custom Content Blocks

For fine-grained control over citations, use custom content blocks:

CitationDocument document = CitationDocument.builder()
    .customContent(
        "The Great Wall of China is approximately 21,196 kilometers long.",
        "It was built over many centuries, starting in the 7th century BC.",
        "The wall was constructed to protect Chinese states from invasions."
    )
    .title("Great Wall Facts")
    .citationsEnabled(true)
    .build();

Citations in Requests

Include the citation documents in the chat options:

ChatResponse response = chatModel.call(
    new Prompt(
        "When was the Eiffel Tower built and how tall is it?",
        AnthropicChatOptions.builder()
            .model("claude-3-7-sonnet-latest")
            .maxTokens(1024)
            .citationDocuments(document)
            .build()
    )
);

Multiple Documents

You can provide multiple documents for Claude to reference:

CitationDocument parisDoc = CitationDocument.builder()
    .plainText("Paris is the capital city of France with a population of 2.1 million.")
    .title("Paris Information")
    .citationsEnabled(true)
    .build();

CitationDocument eiffelDoc = CitationDocument.builder()
    .plainText("The Eiffel Tower was designed by Gustave Eiffel for the 1889 World's Fair.")
    .title("Eiffel Tower History")
    .citationsEnabled(true)
    .build();

ChatResponse response = chatModel.call(
    new Prompt(
        "What is the capital of France and who designed the Eiffel Tower?",
        AnthropicChatOptions.builder()
            .model("claude-3-7-sonnet-latest")
            .citationDocuments(parisDoc, eiffelDoc)
            .build()
    )
);

Accessing Citations

Citations are returned in the response metadata:

ChatResponse response = chatModel.call(prompt);

// Get citations from metadata
List<Citation> citations = (List<Citation>) response.getMetadata().get("citations");

// Optional: Get citation count directly from metadata
Integer citationCount = (Integer) response.getMetadata().get("citationCount");
System.out.println("Total citations: " + citationCount);

// Process each citation
for (Citation citation : citations) {
    System.out.println("Document: " + citation.getDocumentTitle());
    System.out.println("Location: " + citation.getLocationDescription());
    System.out.println("Cited text: " + citation.getCitedText());
    System.out.println("Document index: " + citation.getDocumentIndex());
    System.out.println();
}

Citation Types

Citations carry different location information depending on the document type:

Character Location (Plain Text)

For plain text documents, citations include character indices:

Citation citation = citations.get(0);
if (citation.getType() == Citation.LocationType.CHAR_LOCATION) {
    int start = citation.getStartCharIndex();
    int end = citation.getEndCharIndex();
    String text = citation.getCitedText();
    System.out.println("Characters " + start + "-" + end + ": " + text);
}
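
The relationship between the character indices and the cited text can be sketched with plain string slicing. This assumes that the start index is 0-based and the end index is exclusive, and that the document is simple text where Java `String` indices line up with the reported character positions; treat both as assumptions to verify against real responses. `CharLocationDemo` and `citedSpan` are hypothetical names and are not part of Spring AI.

```java
// Illustrative only: recovers a cited span from a plain text document.
final class CharLocationDemo {

    private CharLocationDemo() {
    }

    // Assumes a 0-based start index and an exclusive end index.
    static String citedSpan(String documentText, int startCharIndex, int endCharIndex) {
        if (startCharIndex < 0 || endCharIndex > documentText.length()
                || startCharIndex > endCharIndex) {
            throw new IllegalArgumentException("citation range is outside the document");
        }
        return documentText.substring(startCharIndex, endCharIndex);
    }

    public static void main(String[] args) {
        String document = "The Eiffel Tower was completed in 1889.";
        System.out.println(citedSpan(document, 0, 16));
        System.out.println(citedSpan(document, 34, 38));
    }
}
```

Comparing the recovered span with `citation.getCitedText()` is one way to confirm that the indexing convention assumed here matches what the API returns.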

Page Location (PDF)

For PDF documents, citations include page numbers:

Citation citation = citations.get(0);
if (citation.getType() == Citation.LocationType.PAGE_LOCATION) {
    int startPage = citation.getStartPageNumber();
    int endPage = citation.getEndPageNumber();
    System.out.println("Pages " + startPage + "-" + endPage);
}

Content Block Location (Custom Content)

For custom content, citations reference specific content blocks:

Citation citation = citations.get(0);
if (citation.getType() == Citation.LocationType.CONTENT_BLOCK_LOCATION) {
    int startBlock = citation.getStartBlockIndex();
    int endBlock = citation.getEndBlockIndex();
    System.out.println("Content blocks " + startBlock + "-" + endBlock);
}

Complete Example

Here is a complete example that demonstrates citation usage:

// Create a citation document
CitationDocument document = CitationDocument.builder()
    .plainText("Spring AI is an application framework for AI engineering. " +
               "It provides a Spring-friendly API for developing AI applications. " +
               "The framework includes abstractions for chat models, embedding models, " +
               "and vector databases.")
    .title("Spring AI Overview")
    .citationsEnabled(true)
    .build();

// Call the model with the document
ChatResponse response = chatModel.call(
    new Prompt(
        "What is Spring AI?",
        AnthropicChatOptions.builder()
            .model("claude-3-7-sonnet-latest")
            .maxTokens(1024)
            .citationDocuments(document)
            .build()
    )
);

// Display the response
System.out.println("Response: " + response.getResult().getOutput().getText());
System.out.println("\nCitations:");

// Process citations
List<Citation> citations = (List<Citation>) response.getMetadata().get("citations");

if (citations != null && !citations.isEmpty()) {
    for (int i = 0; i < citations.size(); i++) {
        Citation citation = citations.get(i);
        System.out.println("\n[" + (i + 1) + "] " + citation.getDocumentTitle());
        System.out.println("    Location: " + citation.getLocationDescription());
        System.out.println("    Text: " + citation.getCitedText());
    }
} else {
    System.out.println("No citations were provided in the response.");
}

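Best Practices

Use descriptive titles: Give citation documents meaningful titles so users can identify the sources in citations.
Check for missing citations: Not every response contains citations, so always verify that the citation metadata exists before accessing it.
Consider document size: Larger documents provide more context but consume more input tokens and may affect response time.
Leverage multiple documents: When a question spans several sources, provide all relevant documents in a single request instead of making multiple calls.
Use the appropriate document type: Choose plain text for simple content, PDF for existing documents, and custom content blocks when you need fine-grained control over citation granularity.

Real-World Use Cases

Legal Document Analysis

Analyze contracts and legal documents while maintaining source attribution: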

CitationDocument contract = CitationDocument.builder()
    .pdfFile("merger-agreement.pdf")
    .title("Merger Agreement 2024")
    .citationsEnabled(true)
    .build();

ChatResponse response = chatModel.call(
    new Prompt(
        "What are the key termination clauses in this contract?",
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .maxTokens(2000)
            .citationDocuments(contract)
            .build()
    )
);

// Citations will reference specific pages in the PDF

Customer Support Knowledge Base

Provide accurate customer support answers with verifiable sources:

CitationDocument kbArticle1 = CitationDocument.builder()
    .plainText(loadKnowledgeBaseArticle("authentication"))
    .title("Authentication Guide")
    .citationsEnabled(true)
    .build();

CitationDocument kbArticle2 = CitationDocument.builder()
    .plainText(loadKnowledgeBaseArticle("billing"))
    .title("Billing FAQ")
    .citationsEnabled(true)
    .build();

ChatResponse response = chatModel.call(
    new Prompt(
        "How do I reset my password and update my billing information?",
        AnthropicChatOptions.builder()
            .model("claude-3-7-sonnet-latest")
            .citationDocuments(kbArticle1, kbArticle2)
            .build()
    )
);

// Citations show which KB articles were referenced

Research and Compliance

Generate reports that require source citations for compliance:

CitationDocument clinicalStudy = CitationDocument.builder()
    .pdfFile("clinical-trial-results.pdf")
    .title("Clinical Trial Phase III Results")
    .citationsEnabled(true)
    .build();

CitationDocument regulatoryGuidance = CitationDocument.builder()
    .plainText(loadRegulatoryDocument())
    .title("FDA Guidance Document")
    .citationsEnabled(true)
    .build();

ChatResponse response = chatModel.call(
    new Prompt(
        "Summarize the efficacy findings and regulatory implications.",
        AnthropicChatOptions.builder()
            .model("claude-sonnet-4")
            .maxTokens(3000)
            .citationDocuments(clinicalStudy, regulatoryGuidance)
            .build()
    )
);

// Citations provide audit trail for compliance

Citation Document Options

Context Field

Optionally provide context about the document. This context is not cited, but it can guide Claude's understanding:

CitationDocument document = CitationDocument.builder()
    .plainText("...")
    .title("Legal Contract")
    .context("This is a merger agreement dated January 2024 between Company A and Company B")
    .build();

Controlling Citations

By default, citations are disabled for all documents (opt-in behavior). To enable citations, explicitly set citationsEnabled(true):

CitationDocument document = CitationDocument.builder()
    .plainText("The Eiffel Tower was completed in 1889...")
    .title("Historical Facts")
    .citationsEnabled(true)  // Explicitly enable citations for this document
    .build();

You can also provide documents without citations as background context:

CitationDocument backgroundDoc = CitationDocument.builder()
    .plainText("Background information about the industry...")
    .title("Context Document")
    // citationsEnabled defaults to false - Claude will use this but not cite it
    .build();

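Anthropic requires the citation setting to be consistent across all documents in a request. You cannot mix citation-enabled and citation-disabled documents in the same request.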
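
Because of this constraint, it can be useful to verify the documents before sending a request. The following is a minimal sketch in plain Java; `CitationSettingsCheck` and `isConsistent` are hypothetical names, and the method takes the `citationsEnabled` flag of each document as a plain boolean rather than relying on any accessor of the real `CitationDocument` class.

```java
// Illustrative only; not part of Spring AI.
final class CitationSettingsCheck {

    private CitationSettingsCheck() {
    }

    // True when every flag has the same value (an empty list is trivially consistent).
    static boolean isConsistent(boolean... citationsEnabledFlags) {
        for (boolean flag : citationsEnabledFlags) {
            if (flag != citationsEnabledFlags[0]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two citation-enabled documents can share a request.
        System.out.println(isConsistent(true, true));
        // A citation-enabled document mixed with a background-only document cannot.
        System.out.println(isConsistent(true, false));
    }
}
```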

Sample Controller

Create a new Spring Boot project and add spring-ai-starter-model-anthropic to your pom (or gradle) dependencies.

Add an application.properties file under the src/main/resources directory to enable and configure the Anthropic chat model:

spring.ai.anthropic.api-key=YOUR_API_KEY
spring.ai.anthropic.chat.options.model=claude-3-5-sonnet-latest
spring.ai.anthropic.chat.options.temperature=0.7
spring.ai.anthropic.chat.options.max-tokens=450

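Replace the api-key value with your Anthropic credentials.

This creates an AnthropicChatModel implementation that you can inject into your classes. Here is an example of a simple @RestController class that uses the chat model for text generation.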

@RestController
public class ChatController {

    private final AnthropicChatModel chatModel;

    @Autowired
    public ChatController(AnthropicChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @GetMapping("/ai/generate")
    public Map<String, String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }

    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return this.chatModel.stream(prompt);
    }
}
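
To try the controller, the message has to be URL-encoded into the query string. The sketch below builds such a request URI with the JDK only. `GenerateEndpointUri` and `build` are hypothetical names, and the base URL `http://localhost:8080` is an assumption about where the application runs; only the `/ai/generate` path and the `message` parameter come from the controller above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative only: builds the URI for the /ai/generate endpoint shown above.
final class GenerateEndpointUri {

    private GenerateEndpointUri() {
    }

    // Form-encodes the message (spaces become '+') and appends it as the "message" parameter.
    static URI build(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/ai/generate?message=" + encoded);
    }

    public static void main(String[] args) {
        // Assumed local address; adjust to wherever the application is running.
        System.out.println(build("http://localhost:8080", "Tell me a joke"));
    }
}
```

The resulting URI can be passed to any HTTP client, for example `java.net.http.HttpClient`, or pasted into a browser.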

Manual Configuration

The AnthropicChatModel implements the ChatModel and StreamingChatModel interfaces and uses the low-level AnthropicApi client to connect to the Anthropic service.

Add the spring-ai-anthropic dependency to your project's Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-anthropic</artifactId>
</dependency>

or to your Gradle build.gradle build file.

dependencies {
    implementation 'org.springframework.ai:spring-ai-anthropic'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create an AnthropicChatModel and use it for text generation:

var anthropicApi = new AnthropicApi(System.getenv("ANTHROPIC_API_KEY"));
var anthropicChatOptions = AnthropicChatOptions.builder()
        .model("claude-3-7-sonnet-20250219")
        .temperature(0.4)
        .maxTokens(200)
        .build();
var chatModel = AnthropicChatModel.builder()
        .anthropicApi(anthropicApi)
        .defaultOptions(anthropicChatOptions)
        .build();

ChatResponse response = chatModel.call(
    new Prompt("Generate the names of 5 famous pirates."));

// Or with streaming responses
Flux<ChatResponse> streamingResponse = chatModel.stream(
    new Prompt("Generate the names of 5 famous pirates."));

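The AnthropicChatOptions class provides the configuration for chat requests. AnthropicChatOptions.Builder is a fluent options builder.

Low-level AnthropicApi Client

AnthropicApi provides a lightweight Java client for the Anthropic Messages API.

The following class diagram illustrates the AnthropicApi chat interfaces and building blocks:

Here is a simple snippet showing how to use the API programmatically: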

AnthropicApi anthropicApi =
    new AnthropicApi(System.getenv("ANTHROPIC_API_KEY"));

AnthropicMessage chatCompletionMessage = new AnthropicMessage(
        List.of(new ContentBlock("Tell me a Joke?")), Role.USER);

// Sync request (anthropicApi and chatCompletionMessage are the local variables declared above)
ResponseEntity<ChatCompletionResponse> response = anthropicApi
    .chatCompletionEntity(new ChatCompletionRequest(AnthropicApi.ChatModel.CLAUDE_3_OPUS.getValue(),
            List.of(chatCompletionMessage), null, 100, 0.8, false));

// Streaming request
Flux<StreamResponse> streamResponse = anthropicApi
    .chatCompletionStream(new ChatCompletionRequest(AnthropicApi.ChatModel.CLAUDE_3_OPUS.getValue(),
            List.of(chatCompletionMessage), null, 100, 0.8, true));

See the AnthropicApi.java JavaDoc for further information.

Low-level API Examples

The AnthropicApiIT.java tests provide some general examples of how to use the lightweight library.
