Token Caching and Cost Optimization

Gemini CLI automatically reduces API costs through token caching when you authenticate with an API key (Gemini API key or Vertex AI). Token caching reuses previously sent system instructions and context, so fewer tokens need to be processed in subsequent requests.

Token caching is available for:

  • API key users (Gemini API key)

  • Vertex AI users (with project and location setup)

Token caching is not available for:

  • OAuth users (Google Personal/Enterprise accounts) - the Code Assist API does not support cached content creation at this time
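
The availability rules above can be summarized as a small lookup. The following is only an illustrative sketch: the `AuthMethod` enum, its values, and `supports_token_caching` are hypothetical names invented for this example and are not part of Gemini CLI.

```python
from enum import Enum


class AuthMethod(Enum):
    """Hypothetical labels for the three authentication modes listed above."""
    GEMINI_API_KEY = "gemini-api-key"
    VERTEX_AI = "vertex-ai"
    OAUTH = "oauth"


# Modes for which the text above states that token caching is available.
CACHING_SUPPORTED = frozenset({AuthMethod.GEMINI_API_KEY, AuthMethod.VERTEX_AI})


def supports_token_caching(method: AuthMethod) -> bool:
    """Return True if the given authentication mode can use token caching."""
    return method in CACHING_SUPPORTED


if __name__ == "__main__":
    for method in AuthMethod:
        print(f"{method.value}: caching available = {supports_token_caching(method)}")
```

OAuth is the only mode excluded here, which mirrors the stated limitation of the Code Assist API.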

You can view your token usage and cached token savings with the /stats command. When cached tokens are available, they appear in the stats output.
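
To interpret the numbers, it can help to express cached tokens as a share of all prompt tokens. The sketch below is plain arithmetic, not Gemini CLI code: `cached_share` is a hypothetical helper, and it assumes the cached token count is a subset of the total prompt token count. It does not parse or reproduce the actual /stats output.

```python
def cached_share(prompt_tokens: int, cached_tokens: int) -> float:
    """Fraction of prompt tokens that were served from the cache.

    Assumption: cached_tokens is counted as part of prompt_tokens.
    """
    if prompt_tokens < 0 or cached_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    if prompt_tokens == 0:
        return 0.0  # nothing was sent, so nothing was cached
    return cached_tokens / prompt_tokens


if __name__ == "__main__":
    # Example figures chosen for illustration only.
    share = cached_share(prompt_tokens=12_000, cached_tokens=9_000)
    print(f"{share:.0%} of prompt tokens came from the cache")  # prints "75% ..."
```

A higher share means more of each request was reused rather than reprocessed. The actual cost effect depends on how cached tokens are priced for your model and account.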