Output format of text LLMs

Last published: 2026-02-06 10:11:18
curl --location "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions" \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-plus",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "你是谁?"
        }
    ],
    "stream":true,
    "stream_options": {
        "include_usage": true
    }
}'
data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1770343445,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-f37af769-6ca0-9b36-83a2-177bce0afe22"}

data: {"choices":[{"finish_reason":null,"logprobs":null,"delta":{"content":"你好"},"index":0}],"object":"chat.completion.chunk","usage":null,"created":1770343445,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-f37af769-6ca0-9b36-83a2-177bce0afe22"}

data: {"choices":[{"delta":{"content":"!我是通"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1770343445,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-f37af769-6ca0-9b36-83a2-177bce0afe22"}

data: {"choices":[{"delta":{"content":"义千问"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1770343445,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-f37af769-6ca0-9b36-83a2-177bce0afe22"}

data: {"choices":[{"delta":{"content":"有任何问题或需要帮助,"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1770343445,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-f37af769-6ca0-9b36-83a2-177bce0afe22"}

data: {"choices":[{"delta":{"content":"欢迎随时告诉我!😊"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1770343445,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-f37af769-6ca0-9b36-83a2-177bce0afe22"}

data: {"choices":[{"finish_reason":"stop","delta":{"content":""},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1770343445,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-f37af769-6ca0-9b36-83a2-177bce0afe22"}

data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":22,"completion_tokens":66,"total_tokens":88,"prompt_tokens_details":{"cached_tokens":0}},"created":1770343445,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-f37af769-6ca0-9b36-83a2-177bce0afe22"}

data: [DONE]
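
The raw stream above uses SSE framing: each event is a line starting with data:, followed by a blank line, and the stream ends with data: [DONE]. Below is a minimal Python sketch of decoding that framing. It assumes one data: line per event, as in this response (the SSE format also allows multi-line events, which this does not handle). iter_sse_payloads is a hypothetical helper name, not part of any SDK.

```python
import json


def iter_sse_payloads(lines):
    """Yield decoded JSON payloads from SSE lines, stopping at the [DONE] sentinel.

    Hypothetical helper for illustration; assumes one "data:" line per event.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and non-data fields are skipped
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel, not JSON
        yield json.loads(data)


# Shortened sample in the same shape as the stream above.
stream = [
    'data: {"choices":[{"delta":{"content":"你好"},"index":0}],"usage":null}',
    "",
    'data: {"choices":[],"usage":{"total_tokens":88}}',
    "",
    "data: [DONE]",
    'data: {"ignored": true}',
]
payloads = list(iter_sse_payloads(stream))
```

Checking for [DONE] before calling json.loads matters because the sentinel is not valid JSON.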

生信小木屋

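Good question. This SSE streaming response really does mix two things: content the LLM actually generated, and fields the API vendor added for protocol, statistics, and control purposes. Here is the breakdown by "is it LLM output or not" 👇

The one-sentence conclusion first

👉 Only the strings in choices[].delta.content are what the LLM actually "said"
👉 Every other field (role, finish_reason, usage, id, created, model, object, and so on) is protocol data or metadata added by the API vendor

Breaking down your stream chunk by chunk

① First chunk (not LLM content)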

{
  "choices":[{
    "delta":{
      "content":"",
      "role":"assistant"
    }
  }]
}

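Meaning:

  • role: "assistant"
    👉 An API protocol field that tells the client:

    what follows is the assistant speaking

  • content: ""
    👉 An empty string, not LLM output

📌 Conclusion
❌ No model-generated text at all
✅ This is the API protocol's initialization chunk

②~⑥ The middle chunks (✅ actual LLM output)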

"delta": { "content": "你好" }
"delta": { "content": "!我是通" }
"delta": { "content": "义千问" }
"delta": { "content": "有任何问题或需要帮助," }
"delta": { "content": "欢迎随时告诉我!😊" }

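Concatenated (the excerpt appears to skip some intermediate chunks, which would explain the gap after 通义千问 and the 66 completion tokens reported in usage):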

你好!我是通义千问有任何问题或需要帮助,欢迎随时告诉我!😊

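📌 Conclusion:

  • 100% LLM-generated tokens
  • The stream is split by token or substring, not sentence by sentence; "!我是通" and "义千问" above split the name 通义千问 across two chunks

👉 Correct client behavior: accumulate delta.content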
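
The accumulation rule above can be sketched in Python as follows. This is an illustration under assumptions, not any SDK's implementation: the chunks are already decoded into dicts, the request uses a single choice (n = 1), and accumulate_text is a hypothetical helper name.

```python
def accumulate_text(chunks):
    """Concatenate delta.content across already-decoded stream chunks.

    Hypothetical helper; assumes a single choice per chunk (n = 1).
    """
    parts = []
    for chunk in chunks:
        # A chunk with an empty "choices" list contributes nothing here.
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:  # skips "" as well as a missing or null content
                parts.append(content)
    return "".join(parts)


# A subset of the chunks from the stream above, reduced to the relevant fields.
chunks = [
    {"choices": [{"delta": {"content": "", "role": "assistant"}, "index": 0}]},
    {"choices": [{"delta": {"content": "你好"}, "index": 0}]},
    {"choices": [{"delta": {"content": "!我是通"}, "index": 0}]},
    {"choices": [{"delta": {"content": "义千问"}, "index": 0}]},
    {"choices": [{"finish_reason": "stop", "delta": {"content": ""}, "index": 0}]},
    {"choices": [], "usage": {"total_tokens": 88}},
]
text = accumulate_text(chunks)
```

Collecting the pieces in a list and joining once at the end avoids repeated string copies on long responses.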

⑦ finish_reason = stop (not LLM content)

"finish_reason": "stop",
"delta": { "content": "" }

Meaning:

  • finish_reason: "stop"
    👉 The API is telling you the model finished normally
  • It is not the model "saying" stop

📌 Conclusion
❌ Not LLM output
✅ End-of-generation signal
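
A small Python sketch of reading that signal separately from the text, assuming chunks are already decoded into dicts. finish_reason_of is a hypothetical helper name. OpenAI-style APIs also define other values such as "length"; whether a given vendor emits them is an assumption to verify against its documentation.

```python
def finish_reason_of(chunk):
    """Return the first non-null finish_reason in a decoded chunk, else None.

    Hypothetical helper for illustration.
    """
    for choice in chunk.get("choices", []):
        reason = choice.get("finish_reason")
        if reason is not None:
            return reason
    return None


middle = {"choices": [{"delta": {"content": "你好"}, "finish_reason": None, "index": 0}]}
last = {"choices": [{"delta": {"content": ""}, "finish_reason": "stop", "index": 0}]}
usage_only = {"choices": [], "usage": {"total_tokens": 88}}

reasons = [finish_reason_of(c) for c in (middle, last, usage_only)]
```

The value is kept as metadata next to the accumulated text and is never appended to it.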

⑧ Final chunk: usage (pure API statistics)

{
  "choices": [],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 66,
    "total_tokens": 88
  }
}

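Meaning:

  • Token counts
  • Used for billing and monitoring
  • Some vendors send it only in the final chunk; in this request it was enabled by stream_options.include_usage: true

📌 Conclusion
❌ Not LLM output at all
✅ API-side statistics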
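
A short Python sketch of picking up those statistics, assuming decoded chunk dicts in which usage is null on every chunk except the one that carries the counts, as in the stream above. extract_usage is a hypothetical helper name.

```python
def extract_usage(chunk):
    """Return the usage dict if this chunk carries one, else None.

    Hypothetical helper; does not touch "choices", so an empty list is fine.
    """
    return chunk.get("usage") or None


content_chunk = {"choices": [{"delta": {"content": "你好"}, "index": 0}], "usage": None}
usage_chunk = {
    "choices": [],
    "usage": {"prompt_tokens": 22, "completion_tokens": 66, "total_tokens": 88},
}

usage = extract_usage(usage_chunk)
```

In the sample, prompt_tokens plus completion_tokens equals total_tokens (22 + 66 = 88).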

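Summary as a table (handy when you write code)

Field                            Source  LLM output?
choices[].delta.content          LLM     ✅ Yes
choices[].delta.role             API     ❌ No
finish_reason                    API     ❌ No
usage                            API     ❌ No
id / model / created / object    API     ❌ No

Practical advice (important)

✅ Correct way to concatenate (pseudocode)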

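// Go-style sketch: chunk is one decoded "data:" payload; assumes a single choice (n = 1).
for _, choice := range chunk.Choices { // runs zero times on the final usage chunk
    fullText += choice.Delta.Content // an empty delta appends nothing
}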

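❌ Common pitfalls

  • Treating role as model output
  • Treating finish_reason as a token
  • Forgetting that a final usage chunk with an empty choices array still arrives

If you like, as a next step I can help you:

  • ✔ Design a generic SSE LLM stream parser (compatible with OpenAI / Qwen / DeepSeek)
  • ✔ Work out when to close the WebSocket / SSE connection
  • ✔ Compare the implementation differences between chat.completions.stream and create(stream=true)

This question is already at the level that people writing SDKs or gateway layers ask 👍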