Gemini CLI Architecture Overview

This document provides a high-level overview of the Gemini CLI's architecture.

Core components

The Gemini CLI consists of two main packages, plus a suite of tools that the system can use while handling command-line input:

  1. CLI package (packages/cli):

    • Purpose: This contains the user-facing portion of the Gemini CLI, such as handling the initial user input, presenting the final output, and managing the overall user experience.

    • Key functions contained in the package:

      • Input processing

      • History management

      • Display rendering

      • Theme and UI customization

      • CLI configuration settings

  2. Core package (packages/core):

    • Purpose: This acts as the backend for the Gemini CLI. It receives requests sent from packages/cli, orchestrates interactions with the Gemini API, and manages the execution of available tools.

    • Key functions contained in the package:

      • API client for communicating with the Google Gemini API

      • Prompt construction and management

      • Tool registration and execution logic

      • State management for conversations or sessions

      • Server-side configuration

  3. Tools (packages/core/src/tools/):

    • Purpose: These are individual modules that extend the capabilities of the Gemini model, allowing it to interact with the local environment (e.g., file system, shell commands, web fetching).

    • Interaction: packages/core invokes these tools based on requests from the Gemini model.
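
The relationship between the core package and its tools can be pictured with a small sketch. Everything below is hypothetical: the `Tool` interface, the `ToolRegistry` class, the `mutating` flag, and the in-memory file map are illustrative names chosen for this example and are not taken from the Gemini CLI source. The sketch only shows the general idea of tools as individual modules that a core looks up by name when the model asks for them.

```typescript
// Hypothetical sketch: none of these names come from the Gemini CLI source.

/** A tool is a self-contained module the model can ask the core to run. */
interface Tool {
  name: string;
  description: string;
  /** True when the tool can change the file system or run shell commands. */
  mutating: boolean;
  run(args: Record<string, string>): string;
}

/** Registry owned by the core: tools are registered once and looked up by name. */
class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  /** Definitions the core could include in a prompt so the model knows what exists. */
  definitions(): { name: string; description: string }[] {
    return Array.from(this.tools.values()).map((tool) => ({
      name: tool.name,
      description: tool.description,
    }));
  }
}

// An in-memory stand-in for a read-only file tool (assumed for the example).
const fakeFiles: Record<string, string> = { "README.md": "# Demo" };

const readFileTool: Tool = {
  name: "read_file",
  description: "Return the contents of a file",
  mutating: false,
  run: (args) => fakeFiles[args["path"] ?? ""] ?? "file not found",
};

const registry = new ToolRegistry();
registry.register(readFileTool);
```

With this shape, the core would call `registry.get("read_file")` when the model requests that tool and pass along the model's arguments. Keeping the `mutating` flag on the tool itself is one possible way to let the core decide when user approval is needed.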

Interaction Flow

A typical interaction with the Gemini CLI follows this flow:

  1. User input: The user types a prompt or command into the terminal, which is managed by packages/cli.

  2. Request to core: packages/cli sends the user's input to packages/core.

  3. Request processed: The core package:

    • Constructs an appropriate prompt for the Gemini API, possibly including conversation history and available tool definitions.

    • Sends the prompt to the Gemini API.

  4. Gemini API response: The Gemini API processes the prompt and returns a response. This response might be a direct answer or a request to use one of the available tools.

  5. Tool execution (if applicable):

    • When the Gemini API requests a tool, the core package prepares to execute it.

    • If the requested tool can modify the file system or execute shell commands, the user is first shown the tool and its arguments and must approve the execution.

    • Read-only operations, such as reading files, might not require explicit user confirmation to proceed.

    • Once confirmed, or if confirmation is not required, the core package executes the requested action within the relevant tool and sends the result back to the Gemini API.

    • The Gemini API processes the tool result and generates a final response.

  6. Response to CLI: The core package sends the final response back to the CLI package.

  7. Display to user: The CLI package formats and displays the response to the user in the terminal.
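
The flow above can be sketched as a single function on the core side. This is a minimal illustration under stated assumptions, not the actual implementation: `handleInput`, `ModelReply`, `FlowTool`, `Model`, and `Confirm` are hypothetical names, the model is a synchronous stub rather than a real API call, and the prompt is modeled as a plain list of strings. The bounded loop also assumes the model may request more than one tool before answering, which the description above does not state.

```typescript
// Hypothetical sketch of the flow; every type and function name is an assumption.

type ModelReply =
  | { kind: "text"; text: string }
  | { kind: "toolCall"; tool: string; args: Record<string, string> };

interface FlowTool {
  mutating: boolean; // can modify files or run shell commands
  run(args: Record<string, string>): string;
}

type Model = (prompt: string[]) => ModelReply;
type Confirm = (tool: string, args: Record<string, string>) => boolean;

function handleInput(
  input: string,
  history: string[],
  tools: Map<string, FlowTool>,
  model: Model,
  confirm: Confirm,
): string {
  // Step 3: build the prompt from history, tool definitions, and the new input.
  const toolNames = Array.from(tools.keys()).join(", ");
  const prompt = [...history, `tools: ${toolNames}`, `user: ${input}`];

  // Bounded so that a misbehaving model cannot loop forever (limit is arbitrary).
  for (let step = 0; step < 8; step++) {
    // Step 4: the model answers directly or asks for a tool.
    const reply = model(prompt);
    if (reply.kind === "text") {
      // Step 6: return the final response to the CLI package, which displays it.
      history.push(`user: ${input}`, `model: ${reply.text}`);
      return reply.text;
    }

    // Step 5: run the tool, asking the user first when the tool is mutating.
    const tool = tools.get(reply.tool);
    let result: string;
    if (tool === undefined) {
      result = `unknown tool: ${reply.tool}`;
    } else if (tool.mutating && !confirm(reply.tool, reply.args)) {
      result = "user declined the tool call";
    } else {
      result = tool.run(reply.args);
    }
    prompt.push(`tool ${reply.tool}: ${result}`);
  }
  throw new Error("too many tool calls");
}

// Usage with stubs: the fake model asks for one tool, then answers.
const writes: string[] = [];
const demoTools = new Map<string, FlowTool>([
  ["write_note", { mutating: true, run: (a) => { writes.push(a["text"] ?? ""); return "ok"; } }],
]);
const fakeModel: Model = (prompt) => {
  const last = prompt[prompt.length - 1] ?? "";
  return last.startsWith("tool ")
    ? { kind: "text", text: `done (${last})` }
    : { kind: "toolCall", tool: "write_note", args: { text: "hello" } };
};
const demoHistory: string[] = [];
const answer = handleInput("save a note", demoHistory, demoTools, fakeModel, () => true);
```

Passing `confirm` in as a callback keeps the approval prompt on the CLI side while the decision point stays in the core, which matches the frontend/backend split described in this document. When the callback returns `false`, the sketch reports the refusal back to the model instead of running the tool.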

Key Design Principles

  • Modularity: Separating the CLI (frontend) from the Core (backend) allows for independent development and potential future extensions (e.g., different frontends for the same backend).

  • Extensibility: The tool system is designed to be extensible, allowing new capabilities to be added.

  • User experience: The CLI focuses on providing a rich and interactive terminal experience.
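
To make the modularity principle concrete, here is a small sketch of two frontends sharing one backend. The names (`Backend`, `EchoBackend`, `TerminalFrontend`, `JsonFrontend`) are hypothetical and exist only for this example. The stand-in backend simply echoes its input, where a real one would build prompts, call the Gemini API, and run tools.

```typescript
// Hypothetical sketch: the interface and class names are illustrative only.

/** The only surface a frontend needs from the backend. */
interface Backend {
  handle(input: string): string;
}

/** Stand-in backend that echoes its input. */
class EchoBackend implements Backend {
  handle(input: string): string {
    return `You said: ${input}`;
  }
}

/** Frontend 1: formats responses for a terminal. */
class TerminalFrontend {
  private readonly backend: Backend;
  constructor(backend: Backend) {
    this.backend = backend;
  }
  submit(input: string): string {
    return `> ${input}\n${this.backend.handle(input)}`;
  }
}

/** Frontend 2: wraps the same responses as JSON, for example for an editor plugin. */
class JsonFrontend {
  private readonly backend: Backend;
  constructor(backend: Backend) {
    this.backend = backend;
  }
  submit(input: string): string {
    return JSON.stringify({ input, output: this.backend.handle(input) });
  }
}

const shared = new EchoBackend();
const terminalView = new TerminalFrontend(shared).submit("hi");
const jsonView = new JsonFrontend(shared).submit("hi");
```

Because both frontends depend only on the `Backend` interface, either one could be developed or replaced without changing the backend.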