12 February, 2026

Getting Started with the GitHub Copilot SDK: Build Your First AI Agent in Five Minutes

Notes, AI

TL;DR

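The core value of the GitHub Copilot SDK is not the convenience of "calling an LLM" (the OpenAI SDK, LangChain, and others already cover that) but a production-proven agent runtime.

The problems it actually solves:

Orchestration complexity: the planner, tool routing, and state management are built in
Stability: it is backed by the reliability of a runtime that millions of developers use every day
Evolvability: new models and tool capabilities arrive automatically through CLI updates

When you start building your next AI application, ask yourself two questions:

Where is my core value? If it lies in business logic and tool definitions, use the SDK; if it lies in innovating on low-level orchestration, build your own framework.
How soon can I reach production? The SDK lets you skip 80% of the infrastructure work and focus on the last 20%, the part that differentiates you.

The barrier to agent development has dropped, but the real challenge remains: defining valuable tools, designing smooth interactions, and solving real problems. Technology is no longer the moat; imagination is.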

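Introduction: Why Agent Development Is No Longer a Game for the Few

In January 2026, GitHub released the Copilot SDK, a key turning point in the move of AI agent development from "expert territory" to "everyday tool".

Before this, building an AI agent that could plan autonomously, call tools, and edit files required you to:

Choose and integrate an LLM service (OpenAI, Anthropic, Azure…)
Build your own agent orchestrator (planner, tool routing, state management)
Handle streaming output, error retries, and context management
Implement a tool definition standard (function calling schema)

This stack is complex and fragile. Open-source frameworks (LangChain, AutoGPT) lower the barrier, but they still demand a solid understanding of how agents run. The real turning point: GitHub has opened up Copilot CLI's production-grade agent runtime as an SDK.

What does this mean? You can start a complete agent runtime with a handful of lines of code: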

import asyncio
from copilot import CopilotClient

async def main():
    client = CopilotClient()
    await client.start()
    session = await client.create_session({"model": "gpt-4.1"})
    response = await session.send_and_wait({"prompt": "Explain quantum entanglement"})
    print(response.data.content)

asyncio.run(main())

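There is no need to worry about model access, prompt engineering, or response parsing: Copilot CLI has already proven these in daily use by millions of developers. You define the business logic, and the SDK handles the rest.

Goal of this article: through a complete weather assistant example, help you understand:

How the SDK communicates with the CLI (the essence of the architecture)
How tool calling works (how the LLM "decides" to call your code)
The key steps from toy to tool (streaming responses, event listeners, state management)

Whether you want to quickly validate an AI application idea or build a custom agent for your company, this article is a starting point.

Preparation: Environment Setup

Before writing code, make sure your development environment meets the following requirements.

Prerequisites Checklist

1. Install GitHub Copilot CLI

The SDK itself contains no AI inference capability; it talks to Copilot CLI over JSON-RPC. The CLI is the real "engine", and the SDK is the "steering wheel".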

# macOS/Linux
brew install copilot-cli

# Verify installation
copilot --version

2. Authenticate your GitHub account

copilot login

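You need a GitHub Copilot subscription (individual or enterprise). If you use BYOK (Bring Your Own Key) mode, you can skip this step.

Verify the Environment

Run the following command to confirm that the CLI works: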

copilot -p "Explain recursion in one sentence"

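If you see an answer from the AI, the environment is ready.

Step 1: Send Your First Message

Install the SDK

Create a project directory and install the Python SDK: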

mkdir copilot-demo && cd copilot-demo
# working in virtual env
python -m venv venv && source venv/bin/activate
pip install github-copilot-sdk

Minimal Code Example

Create main.py:

import asyncio
from copilot import CopilotClient

async def main():
    client = CopilotClient()
    await client.start()
    
    session = await client.create_session({"model": "gpt-4.1"})
    response = await session.send_and_wait({"prompt": "What is quantum entanglement?"})
    
    print(response.data.content)
    
    await client.stop()

asyncio.run(main())

Run:

python main.py

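You will see the AI's complete answer. With about ten lines of code, a full AI conversation is done.

Execution Flow Explained

What happens behind this code?

1. client.start()     → The SDK launches the Copilot CLI process (running in the background)
2. create_session()   → Asks the CLI over JSON-RPC to create a session
3. send_and_wait()    → Sends the prompt; the CLI forwards it to the LLM
4. LLM inference      → The response returns to the SDK through the CLI
5. response.data      → The SDK parses the JSON response and extracts the content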

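Architecture in Essence: The SDK Is the CLI's "Remote Control"

GitHub's design philosophy is separation of concerns:

Component	Responsibility
Copilot CLI	Agent runtime (planning, tool calling, LLM communication)
SDK	Process management, JSON-RPC wrapper, event listening
Your code	Business logic, tool definitions

Advantages of this architecture:

The CLI can be upgraded independently: new models and tool capabilities require no SDK changes
Multi-language support is cheap: each language SDK only needs to implement a JSON-RPC client
Debugging is easier: the CLI can run on its own, which makes it easy to inspect logs and troubleshoot

JSON-RPC Communication Example

When you call send_and_wait(), the SDK sends a request along these lines: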

{
  "jsonrpc": "2.0",
  "method": "session.send",
  "params": {
    "sessionId": "abc123",
    "prompt": "What is quantum entanglement?"
  },
  "id": 1
}

The CLI responds:

{
  "jsonrpc": "2.0",
  "result": {
    "data": {
      "content": "Quantum entanglement refers to a phenomenon where two or more quantum systems..."
    }
  },
  "id": 1
}

This point matters: the SDK is not "calling the LLM" but "calling the CLI". The CLI has already encapsulated all the complexity.
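To make the request/response pairing concrete, here is a minimal sketch of a JSON-RPC 2.0 envelope in plain Python. It is not the SDK's implementation: `build_request` and `parse_response` are hypothetical helper names, the `session.send` method and its parameter fields are simply copied from the example above, and the reply is canned rather than coming from a real CLI.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC ids only need to be unique per connection


def build_request(method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request (hypothetical helper, not an SDK API)."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": next(_ids)}
    )


def parse_response(raw: str, expected_id: int) -> dict:
    """Return the `result` of a response, raising on an error or id mismatch."""
    msg = json.loads(raw)
    if msg.get("id") != expected_id:
        raise ValueError(f"response id {msg.get('id')} != request id {expected_id}")
    if "error" in msg:
        raise RuntimeError(msg["error"].get("message", "unknown JSON-RPC error"))
    return msg["result"]


request = build_request("session.send", {"sessionId": "abc123", "prompt": "Hi"})
request_id = json.loads(request)["id"]

# A canned reply standing in for what the CLI would send back.
reply = json.dumps(
    {"jsonrpc": "2.0", "result": {"data": {"content": "Hello!"}}, "id": request_id}
)
print(parse_response(reply, request_id)["data"]["content"])
```

The `id` field is what lets a client match each reply to the request that caused it, which matters once several requests are in flight on the same connection.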

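Step 2: Real-Time Responses with Streaming Output

Why Streaming Matters

With send_and_wait(), you must wait until the LLM has generated the complete answer before you see any output. For long generations (code explanations, documentation), users may stare at a blank screen for 10-30 seconds.

Streaming lets the AI output text incrementally, like a typewriter. This improves the user experience and also lets you notice early when the model goes off track.

Event Listeners

Modify main.py to enable streaming output: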

import asyncio
import sys
from copilot import CopilotClient
from copilot.generated.session_events import SessionEventType

async def main():
    client = CopilotClient()
    await client.start()
    
    session = await client.create_session({
        "model": "gpt-4.1",
        "streaming": True,  # Enable streaming mode
    })
    
    # Listen for response deltas
    def handle_event(event):
        if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
            sys.stdout.write(event.data.delta_content)
            sys.stdout.flush()
        if event.type == SessionEventType.SESSION_IDLE:
            print()  # Newline when complete
    
    session.on(handle_event)
    
    await session.send_and_wait({"prompt": "Write a code example of quicksort"})

    await client.stop()

asyncio.run(main())

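When you run it, the result "flows out" gradually instead of appearing all at once.

Design Philosophy of the Event-Driven Model

The SDK uses the observer pattern to handle the CLI's asynchronous event stream:

CLI generates events → SDK parses → Dispatches to listeners → Your handle_event() executes

Main event types:

Event	When it fires	Typical use
`ASSISTANT_MESSAGE_DELTA`	The AI has generated a fragment of content	Real-time display
`ASSISTANT_MESSAGE`	The AI has finished a complete message	Getting the final content
`SESSION_IDLE`	The session becomes idle	Marking the task as complete
`TOOL_CALL`	The AI decides to call a tool	Logging, permission checks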
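The observer pattern itself fits in a few lines. The sketch below is illustrative only and is not the SDK's class: `EventSource`, `Event`, and the idea that `on()` returns an unsubscribe function are assumptions made for the example, and the event types are plain strings borrowed from the table above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Event:
    type: str
    data: Any = None


@dataclass
class EventSource:
    """Tiny observer-pattern dispatcher (illustrative, not the SDK's class)."""

    _listeners: List[Callable[[Event], None]] = field(default_factory=list)

    def on(self, listener: Callable[[Event], None]) -> Callable[[], None]:
        """Register a listener and return a function that unregisters it."""
        self._listeners.append(listener)
        return lambda: self._listeners.remove(listener)

    def emit(self, event: Event) -> None:
        # Iterate over a copy so listeners may unsubscribe while being called.
        for listener in list(self._listeners):
            listener(event)


source = EventSource()
seen = []
unsubscribe = source.on(lambda e: seen.append(e.type))

source.emit(Event("ASSISTANT_MESSAGE_DELTA", "Hel"))
source.emit(Event("SESSION_IDLE"))
unsubscribe()
source.emit(Event("ASSISTANT_MESSAGE_DELTA", "ignored"))  # no listener left

print(seen)
```

Every listener receives every event and filters by type itself, which mirrors how `handle_event` in this article checks `event.type` before acting.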

Code Comparison: Synchronous vs Streaming

Synchronous mode (wait for the whole answer), suited to short answers:

response = await session.send_and_wait({"prompt": "1+1=?"})
print(response.data.content)  # Wait and print all at once

Streaming mode, suited to long text:

session.on(lambda event: 
    print(event.data.delta_content, end="") 
    if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA 
    else None
)
await session.send_and_wait({"prompt": "Write an article"})

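Technical Details Behind the Scenes

Streaming builds on a token stream from the LLM (commonly delivered over Server-Sent Events (SSE) or WebSocket), which the CLI relays to the SDK as events:

The CLI receives a token stream from the LLM
For each token it receives, the CLI sends a message_delta event to the SDK
The SDK triggers your event listener
The user sees the new content immediately

This design lets your application observe the AI's "thinking process" as it unfolds, not just the final result.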
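A listener often needs the assembled text as well as the live fragments. The following sketch shows one way to accumulate deltas. It runs against fake events built with `SimpleNamespace`; the `delta_content` attribute follows the handler code in this article, while `DeltaCollector`, `fake_event`, and the string event types are stand-ins invented for the example.

```python
from types import SimpleNamespace


class DeltaCollector:
    """Collects streamed fragments and exposes the full text once idle."""

    def __init__(self):
        self.parts = []
        self.done = False

    def __call__(self, event):
        if event.type == "ASSISTANT_MESSAGE_DELTA":
            self.parts.append(event.data.delta_content)
        elif event.type == "SESSION_IDLE":
            self.done = True

    @property
    def text(self) -> str:
        return "".join(self.parts)


def fake_event(type_, delta=None):
    """Build an object shaped like the events used in this article."""
    return SimpleNamespace(type=type_, data=SimpleNamespace(delta_content=delta))


collector = DeltaCollector()
for ev in [
    fake_event("ASSISTANT_MESSAGE_DELTA", "def quick"),
    fake_event("ASSISTANT_MESSAGE_DELTA", "sort(xs):"),
    fake_event("SESSION_IDLE"),
]:
    collector(ev)

print(collector.text, collector.done)
```

Because the collector is callable, it could be passed wherever a plain handler function is expected.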

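Step 3: Give the AI Capabilities with Custom Tools

The Essence of Tools: Letting the LLM Call Your Code

So far the AI can only "talk"; it cannot interact with the outside world. Tools are an agent's core capability: you define functions, and the AI decides when to call them.

For example:

User: "What's the weather like in Beijing today?"
AI reasons: I need weather data → call get_weather("Beijing")
Your code: returns {"temperature": "15°C", "condition": "sunny"}
AI composes: "It is sunny in Beijing today, 15°C."

Key point: the AI decides on its own whether to call a tool and which arguments to pass.

Three Elements of a Tool Definition

A tool consists of:

Description: tells the AI what the tool is for
Parameter schema (parameters): defines the structure of the input arguments (using Pydantic)
Handler: the Python function that actually runs

Complete Weather Assistant Example

Create weather_assistant.py: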

import asyncio
import random
import sys
from copilot import CopilotClient
from copilot.tools import define_tool
from copilot.generated.session_events import SessionEventType
from pydantic import BaseModel, Field

# 1. Define parameter schema
class GetWeatherParams(BaseModel):
    city: str = Field(description="City name, e.g., Beijing, Shanghai")

# 2. Define tool (description + handler)
@define_tool(description="Get current weather for a specified city")
async def get_weather(params: GetWeatherParams) -> dict:
    city = params.city
    
    # In production, call a real weather API here
    # Using mock data for demonstration
    conditions = ["sunny", "cloudy", "rainy", "overcast"]
    temp = random.randint(10, 30)
    condition = random.choice(conditions)
    
    return {
        "city": city,
        "temperature": f"{temp}°C",
        "condition": condition
    }

async def main():
    client = CopilotClient()
    await client.start()
    
    # 3. Pass tools to session
    session = await client.create_session({
        "model": "gpt-4.1",
        "streaming": True,
        "tools": [get_weather],  # Register tool
    })
    
    # Listen for streaming responses
    def handle_event(event):
        if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
            sys.stdout.write(event.data.delta_content)
            sys.stdout.flush()
        if event.type == SessionEventType.SESSION_IDLE:
            print("\n")
    
    session.on(handle_event)
    
    # Send a prompt that requires tool calls
    await session.send_and_wait({
        "prompt": "What's the weather like in Beijing and Shanghai? Compare them."
    })
    
    await client.stop()

asyncio.run(main())

Run:

python weather_assistant.py

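Execution Flow in Detail

When you ask "What's the weather like in Beijing and Shanghai?":

The AI analyzes the question → it needs weather data
The AI checks the available tools → finds the get_weather function
The AI decides to call → get_weather(city="Beijing")
The SDK invokes the handler → your function returns {"temperature": "22°C", …}
The AI receives the result → calls get_weather(city="Shanghai")
The AI composes the answer → "Beijing is sunny at 22°C; Shanghai is overcast at 18°C…"

The AI calls the tool multiple times on its own (once for Beijing, once for Shanghai); you do not write any loop logic.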
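The loop you do not have to write lives in the runtime. As a rough sketch of what such a dispatcher does, the code below replaces the model with a scripted list of tool calls and routes each one to a registered handler. Everything here is hypothetical (the `tool` decorator, `TOOLS` registry, `run_turn`, and the assumption that arguments arrive as JSON text); it is not how the SDK or CLI is implemented.

```python
import asyncio
import json

TOOLS = {}  # tool name -> coroutine function


def tool(fn):
    """Register a coroutine under its function name (hypothetical decorator)."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
async def get_weather(city: str) -> dict:
    # Fixed mock data so the example is deterministic.
    data = {"Beijing": ("22°C", "sunny"), "Shanghai": ("18°C", "overcast")}
    temperature, condition = data[city]
    return {"city": city, "temperature": temperature, "condition": condition}


async def run_turn(model_steps):
    """Execute each tool call the (scripted) model requests, in order."""
    results = []
    for step in model_steps:
        handler = TOOLS[step["name"]]
        # Assumption: arguments arrive as JSON text and are decoded here.
        arguments = json.loads(step["arguments"])
        results.append(await handler(**arguments))
    return results


# Stand-in for the model's decisions; a real LLM produces these itself.
scripted_calls = [
    {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
    {"name": "get_weather", "arguments": '{"city": "Shanghai"}'},
]
results = asyncio.run(run_turn(scripted_calls))
for result in results:
    print(result)
```

In a real agent each result would be fed back to the model, which then decides whether to call another tool or write the final answer.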

Why the Parameter Schema Matters

Why define parameters with Pydantic?

class GetWeatherParams(BaseModel):
    city: str = Field(description="City name")
    unit: str = Field(default="celsius", description="Temperature unit: celsius or fahrenheit")

The SDK converts this model into a JSON Schema along these lines and passes it to the LLM:

{
  "type": "object",
  "properties": {
    "city": {"type": "string", "description": "City name"},
    "unit": {"type": "string", "default": "celsius", "description": "Temperature unit: celsius or fahrenheit"}
  },
  "required": ["city"]
}

The LLM extracts arguments according to this schema, so the clearer the descriptions, the more accurate the AI's calls.
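The schema also gives the receiving side something to check arguments against before a handler runs. Below is a deliberately small, hand-rolled sketch of that idea using only the standard library. `check_arguments` is a hypothetical helper that understands only `required`, primitive `type`, and `default` on a flat object; in practice Pydantic (or a real JSON Schema validator) does this work.

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "City name"},
        "unit": {
            "type": "string",
            "default": "celsius",
            "description": "Temperature unit: celsius or fahrenheit",
        },
    },
    "required": ["city"],
}

_PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_arguments(schema: dict, args: dict) -> dict:
    """Validate args against a flat object schema and fill in defaults.

    Covers only `required`, primitive `type`, and `default`; this is not a
    full JSON Schema validator.
    """
    missing = [name for name in schema.get("required", []) if name not in args]
    if missing:
        raise ValueError(f"missing required argument(s): {', '.join(missing)}")

    checked = {}
    for name, spec in schema["properties"].items():
        if name in args:
            if not isinstance(args[name], _PY_TYPES[spec["type"]]):
                raise TypeError(f"{name!r} should be of type {spec['type']}")
            checked[name] = args[name]
        elif "default" in spec:
            checked[name] = spec["default"]
    return checked


print(check_arguments(SCHEMA, {"city": "Beijing"}))
```

Calling it with `{"city": "Beijing"}` fills in the default unit, while omitting `city` raises a `ValueError`, which is the kind of feedback a runtime can hand back to the model.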

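Step 4: Build an Interactive Assistant

Now combine all the capabilities: streaming output + tool calling + command-line interaction.

Complete Runnable Code

Create interactive_assistant.py: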

import asyncio
import random
import sys
from copilot import CopilotClient
from copilot.tools import define_tool
from copilot.generated.session_events import SessionEventType
from pydantic import BaseModel, Field

# Define tools
class GetWeatherParams(BaseModel):
    city: str = Field(description="City name, e.g., Beijing, Shanghai, Guangzhou")

@define_tool(description="Get current weather for a specified city")
async def get_weather(params: GetWeatherParams) -> dict:
    city = params.city
    conditions = ["sunny", "cloudy", "rainy", "overcast", "hazy"]
    temp = random.randint(5, 35)
    condition = random.choice(conditions)
    humidity = random.randint(30, 90)
    
    return {
        "city": city,
        "temperature": f"{temp}°C",
        "condition": condition,
        "humidity": f"{humidity}%"
    }

async def main():
    client = CopilotClient()
    await client.start()
    
    session = await client.create_session({
        "model": "gpt-4.1",
        "streaming": True,
        "tools": [get_weather],
    })
    
    # Event listeners
    def handle_event(event):
        if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
            sys.stdout.write(event.data.delta_content)
            sys.stdout.flush()
        if event.type == SessionEventType.SESSION_IDLE:
            print()  # Newline when complete
    
    session.on(handle_event)
    
    # Interactive conversation loop
    print("🌤️  Weather Assistant (type 'exit' to quit)")
    print("Try: 'What's the weather in Beijing?' or 'Compare weather in Guangzhou and Shenzhen'\n")
    
    while True:
        try:
            user_input = input("You: ")
        except EOFError:
            break
        
        if user_input.lower() in ["exit", "quit"]:
            break
        
        if not user_input.strip():
            continue
        
        sys.stdout.write("Assistant: ")
        await session.send_and_wait({"prompt": user_input})
        print()  # Extra newline
    
    await client.stop()
    print("Goodbye!")

asyncio.run(main())

Running It

python interactive_assistant.py

Sample conversation:

🌤️  Weather Assistant (type 'exit' to quit)
Try: 'What's the weather in Beijing?' or 'Compare weather in Guangzhou and Shenzhen'

You: Compare weather in Guangzhou and Shenzhen
Assistant: Guangzhou: 21°C, sunny, 84% humidity.
Shenzhen: 33°C, hazy, 77% humidity.
Shenzhen is significantly warmer and hazier, while Guangzhou is cooler and sunnier with slightly higher humidity.

You: What's the weather in Shenzhen
Assistant: The weather in Shenzhen is 8°C, overcast, with 47% humidity.

You: quit
Goodbye!

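Key Design Points

1. Session persistence

Note that we create the session only once and reuse it throughout the conversation loop. This means:

The AI remembers earlier parts of the conversation
You can follow up with "What about tomorrow?" (the AI knows which city you mean)
Tool call history is preserved as well

2. Async I/O done right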

# Using input() in while True loop
user_input = input("You: ")  # Synchronous blocking, but acceptable here

# send_and_wait() is async
await session.send_and_wait({"prompt": user_input})

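Why is blocking on input() acceptable? input() does block the event loop, but while the program waits for the user no request is in flight, so nothing else needs the loop. The asynchronous work that matters happens while communicating with the CLI. If you later add background tasks that must keep running during input, move input() to a worker thread.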
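If blocking the loop ever becomes a problem, one common approach is to run the blocking read in a worker thread. The sketch below uses `asyncio.to_thread` (Python 3.9+). `ainput`, `heartbeat`, and `demo` are hypothetical names for this example, and the `reader` parameter exists only so the read function can be swapped out.

```python
import asyncio


async def ainput(prompt: str = "", reader=input) -> str:
    """Read a line without blocking the event loop (hypothetical helper)."""
    # asyncio.to_thread runs the blocking call in a worker thread.
    return await asyncio.to_thread(reader, prompt)


async def heartbeat(ticks: list) -> None:
    """Background task that only makes progress if the loop is not blocked."""
    while True:
        ticks.append(1)
        await asyncio.sleep(0.01)


async def demo(reader=input):
    """Wait for one line of input while a background task keeps running."""
    ticks = []
    task = asyncio.create_task(heartbeat(ticks))
    line = await ainput("You: ", reader)
    task.cancel()
    return line, len(ticks)
```

Running `asyncio.run(demo())` in a terminal should show the tick count growing the longer you wait before pressing Enter, which is the evidence that the loop stayed responsive during the read.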

3. Graceful exit

try:
    user_input = input("You: ")
except EOFError:  # Catch Ctrl+D
    break

Handling EOFError and the common exit commands (exit, quit) keeps the user experience smooth.
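A graceful exit also means the client is stopped on every path out of the loop, including unexpected exceptions. One way to sketch that is `try`/`finally`. The example uses a `FakeClient` stand-in that only mimics the `start()`/`stop()` shape used in this article, and `chat` with its `read_line` parameter is a hypothetical helper, so the code runs without the SDK installed.

```python
import asyncio


class FakeClient:
    """Stand-in with the same start()/stop() shape as the client in this article."""

    def __init__(self):
        self.running = False

    async def start(self):
        self.running = True

    async def stop(self):
        self.running = False


async def chat(client, read_line) -> int:
    """Run the input loop and always stop the client, however the loop ends."""
    await client.start()
    turns = 0
    try:
        while True:
            try:
                text = read_line("You: ")
            except EOFError:  # Ctrl+D at the prompt
                break
            if text.strip().lower() in {"exit", "quit"}:
                break
            if text.strip():
                turns += 1  # a real loop would send the prompt here
    finally:
        await client.stop()
    return turns


scripted = iter(["hello", "", "quit"])  # scripted input instead of a keyboard
client = FakeClient()
turns = asyncio.run(chat(client, lambda prompt: next(scripted)))
print(turns, client.running)
```

With real input, pass `input` as `read_line`; the `finally` block is what guarantees the background CLI process is not left running.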

Ideas for Extension

Building on this framework, you can quickly extend functionality:

Add more tools:

@define_tool(description="Query real-time stock price")
async def get_stock_price(params): ...

@define_tool(description="Search information on the web")
async def web_search(params): ...

session = await client.create_session({
    "tools": [get_weather, get_stock_price, web_search],
})

The AI automatically picks the right tool for the user's question.

Add a system prompt:

session = await client.create_session({
    "model": "gpt-4.1",
    "tools": [get_weather],
    "system_message": {
        "content": "You are a professional weather assistant. Keep answers concise but informative."
    }
})

Log tool calls:

def handle_event(event):
    if event.type == SessionEventType.TOOL_CALL:
        print(f"\n[Debug] AI called tool: {event.data.tool_name}")
        print(f"[Debug] Arguments: {event.data.arguments}\n")

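Debugging Tips

During development, watching the CLI's logs is essential for understanding the agent's behavior.

Start a standalone CLI server:

# Start the CLI server in debug mode
copilot --headless --log-level debug --port 9999

# Optional: specify a log directory
copilot --headless --log-level debug --port 9999 --log-dir ./logs

Connect to the server from your code: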

client = CopilotClient({
    'cli_url': 'http://localhost:9999',
})
await client.start()  # Connects to the existing server instead of starting a new process

View logs:

By default, logs are stored in the ~/.copilot/logs/ directory, and each server process has its own log file. Use tail -f to watch them in real time:

tail -f ~/.copilot/logs/process-<timestamp>-<pid>.log
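Since each process writes its own file, you first need to find the right one. A small helper can pick the most recently modified log for you. This is a sketch under the assumption that logs are `*.log` files in a single directory, as the text above describes; `newest_log` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def newest_log(log_dir: Path) -> Optional[Path]:
    """Return the most recently modified *.log file in log_dir, or None."""
    logs = [p for p in log_dir.glob("*.log") if p.is_file()]
    return max(logs, key=lambda p: p.stat().st_mtime, default=None)


# Usage (default location quoted in this article; adjust if you pass --log-dir):
#   print(newest_log(Path.home() / ".copilot" / "logs"))
```

The returned path can then be handed to `tail -f` or opened directly.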

Debug tool calls:

def handle_event(event):
    # Tool call started
    if event.type == SessionEventType.TOOL_USER_REQUESTED:
        print(f"[Tool Call] {event.data.tool_name}")
        print(f"Arguments: {event.data.arguments}")
    
    # Tool execution result
    if event.type == SessionEventType.TOOL_EXECUTION_COMPLETE:
        print(f"[Tool Result] {event.data.tool_name}")
        print(f"Result: {event.data.result}")
    
    # The AI's final response
    if event.type == SessionEventType.ASSISTANT_MESSAGE:
        print(f"[Assistant] {event.data.content[:100]}...")

session.on(handle_event)

This pattern gives you a clear view of the whole tool-call chain: AI decision → tool execution → result returned → final response.