From fa12f315a8d13354c48c7676f3990d4e350110ea Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 21:29:16 +0800 Subject: [PATCH 01/17] Update 07-Chapter-01-Prompt-Chaining.md --- 07-Chapter-01-Prompt-Chaining.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/07-Chapter-01-Prompt-Chaining.md b/07-Chapter-01-Prompt-Chaining.md index 341a1d7..931fc22 100644 --- a/07-Chapter-01-Prompt-Chaining.md +++ b/07-Chapter-01-Prompt-Chaining.md @@ -20,7 +20,7 @@ Furthermore, prompt chaining is not just about breaking down problems; it also e **Limitations of single prompts:** For multifaceted tasks, using a single, complex prompt for an LLM can be inefficient, causing the model to struggle with constraints and instructions, potentially leading to instruction neglect where parts of the prompt are overlooked, contextual drift where the model loses track of the initial context, error propagation where early errors amplify, prompts which require a longer context window where the model gets insufficient information to respond back and hallucination where the cognitive load increases the chance of incorrect information. For example, a query asking to analyze a market research report, summarize findings, identify trends with data points, and draft an email risks failure as the model might summarize well but fail to extract data or draft an email properly. -单一提示的局限性:对于包含多个子任务的复杂任务,使用单一复杂提示往往效率不高。模型可能难以同时满足多项约束和指示,从而出现以下问题:忽视部分指令、上下文漂移(contextual drift)、早期错误被放大、上下文超出窗口导致信息不足,以及因认知负担加重而产生幻觉。 +单一提示的局限性:对于包含多个子任务的复杂任务,使用单一复杂提示往往效率不高。模型可能难以同时满足多项约束和指示,从而出现以下问题:忽视部分指令、上下文漂移(Contextual Drift)、早期错误被放大、上下文超出窗口限制导致信息不足,以及因认知负担加重而产生幻觉。 例如,要求模型在单次调用中同时完成分析市场报告、总结要点、识别趋势和草拟邮件等多项任务,失败概率极高。模型或许能给出不错的总结,但在提取精确数据或撰写得体邮件这类更细致的环节上,就很容易出错。 @@ -227,7 +227,7 @@ This principle is fundamental to the development of conversational agents, enabl **6. Code Generation and Refinement:** The generation of functional code is typically a multi-stage process, requiring a problem to be decomposed into a sequence of discrete logical operations that are executed progressively -代码生成和优化:功能性代码的生成通常是一个多阶段的过程,它要求将问题分解为一系列可有序执行的逻辑操作。 +代码生成和优化:可用代码的生成通常是一个多阶段的过程,它要求将问题分解为一系列可有序执行的逻辑操作。 - Prompt 1: Understand the user's request for a code function. Generate pseudocode or an outline. @@ -378,13 +378,13 @@ This Python code demonstrates how to use the LangChain library to process text. Context Engineering (see Fig.1) is the systematic discipline of designing, constructing, and delivering a complete informational environment to an AI model prior to token generation. This methodology asserts that the quality of a model's output is less dependent on the model's architecture itself and more on the richness of the context provided. -上下文工程(Context Engineering,见图 1)是一门系统性的学科,它研究的是在 AI 模型生成词元(Token)之前,如何为其设计、构建并提供一个完整的信息环境。这一方法论主张,模型输出的质量与其说取决于模型自身的架构,不如说更依赖于所提供上下文的丰富程度。 +上下文工程(Context Engineering,见图 1)是一门系统性的学科,致力于在 AI 模型生成词元(Token)之前,如何为其设计、构建并提供一个完整的信息环境。这一方法论主张,模型输出的质量与其说取决于模型自身的架构,不如说更依赖于所提供上下文的丰富程度。 ![上下文工程](/images/chapter01_fig1.png) Fig.1: Context Engineering is the discipline of building a rich, comprehensive informational environment for an AI, as the quality of this context is a primary factor in enabling advanced Agentic performance. 
-图 1:上下文工程是一门为 AI 构建丰富、全面信息环境的学科,因为高质量的上下文是实现高级智能体性能的首要因素。 +图 1:上下文工程是一门为 AI 构建丰富、全面信息环境的学科,因为高质量的上下文是支撑高级智能体性能的首要因素。 It represents a significant evolution from traditional prompt engineering, which focuses primarily on optimizing the phrasing of a user's immediate query. Context Engineering expands this scope to include several layers of information, such as the system prompt, which is a foundational set of instructions defining the AI's operational parameters—for instance, "You are a technical writer; your tone must be formal and precise." The context is further enriched with external data. This includes retrieved documents, where the AI actively fetches information from a knowledge base to inform its response, such as pulling technical specifications for a project. It also incorporates tool outputs, which are the results from the AI using an external API to obtain real-time data, like querying a calendar to determine a user's availability. This explicit data is combined with critical implicit data, such as user identity, interaction history, and environmental state. The core principle is that even advanced models underperform when provided with a limited or poorly constructed view of the operational environment. From 7203510fa72803f381b4b9788ff3232c790b0fa5 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 21:35:43 +0800 Subject: [PATCH 02/17] Revise terminology and enhance document clarity Updated terminology and improved clarity in the document. --- 11-Chapter-05-Tool-Use.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/11-Chapter-05-Tool-Use.md b/11-Chapter-05-Tool-Use.md index 78196db..cbe369f 100644 --- a/11-Chapter-05-Tool-Use.md +++ b/11-Chapter-05-Tool-Use.md @@ -104,7 +104,7 @@ Using external calculators, data analysis libraries, or statistical tools. - 工具:计算器函数、股票行情接口、电子表格工具。 - 智能体流程:用户提问「苹果公司当前股价是多少?如果我以 150 美元买入 100 股,可能会赚多少钱?」,大语言模型会先调用股票行情接口获取最新价格,然后调用计算器工具计算收益,最后把结果整理并返回给用户。 -**4. Sending Communications:** | 发送通知: +**4. Sending Communications:** | 发送消息: Sending emails, messages, or making API calls to external communication services. @@ -398,7 +398,7 @@ financial_crew = Crew( # --- 5. Run the Crew within a Main Execution Block --- # Using a __name__ == "__main__": block is a standard Python best practice. # --- 5. 在主程序中运行 --- -# 使用 __name__ == "__main__": 块是 Python 的最佳实践。 +# 使用 __name__ == "__main__": 代码块是 Python 的最佳实践。 def main(): """Main function to run the crew.""" # Check for API key before starting to avoid runtime errors. 
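# NOTE (editorial sketch): the hunk above only shows a fragment of the crew's
# entry point. A minimal, self-contained version of the practice it describes --
# guarding execution with `__name__ == "__main__"` and failing fast when the
# API key is missing -- could look like the following. `financial_crew` is the
# Crew object defined earlier in this example; the exact kickoff call and the
# environment-variable name are assumptions for illustration, not the book's
# final code.
import os
import sys

def run_crew():
    """Run the crew only when credentials are available."""
    if not os.environ.get("OPENAI_API_KEY"):
        # Fail fast with a clear message instead of a mid-run exception.
        sys.exit("OPENAI_API_KEY is not set; aborting before kickoff.")
    result = financial_crew.kickoff()  # kickoff() starts the CrewAI workflow
    print(result)

if __name__ == "__main__":
    run_crew()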
@@ -661,7 +661,7 @@ This script uses Google's Agent Development Kit (ADK) to create an agent that so 核心的异步函数 call_vsearch_agent_async 用于与智能体交互,该函数接收查询请求构造为消息对象,并作为参数传给 run_async 方法从而实现将查询请求发送给智能体并等待异步事件返回。 -随后该函数以流式方式将智能体的响应输出到控制台,并打印关于最终响应的信息,包括来自数据存储的元数据。代码具备错误处理机制,以捕获智能体执行期间的异常,并提供有价值的上下文信息,如数据存储 ID 不正确或权限缺失等。 +随后该函数以流式方式将智能体的响应输出到控制台,并打印关于最终响应的信息,包括来自数据存储的引用来源。代码具备错误处理机制,以捕获智能体执行期间的异常,并提供有价值的上下文信息,如数据存储 ID 不正确或权限缺失等。 另一个异步函数 run_vsearch_example 用于演示如何调用该智能体。主执行块先检查 DATASTORE_ID 是否已设置,然后使用 asyncio.run 运行示例。代码最后还包含一个异常检查,避免在已有运行事件循环的环境(如 Jupyter notebook)中运行代码时出现错误。 From 18def1578688cb6b1f1da60fd2c5eac071a79b71 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 21:54:27 +0800 Subject: [PATCH 03/17] Update 13-Chapter-07-Multi-Agent-Collaboration.md --- 13-Chapter-07-Multi-Agent-Collaboration.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/13-Chapter-07-Multi-Agent-Collaboration.md b/13-Chapter-07-Multi-Agent-Collaboration.md index 944f012..d3501b2 100644 --- a/13-Chapter-07-Multi-Agent-Collaboration.md +++ b/13-Chapter-07-Multi-Agent-Collaboration.md @@ -6,7 +6,7 @@ While a monolithic agent architecture can be effective for well-defined problems For example, a complex research query might be decomposed and assigned to a Research Agent for information retrieval, a Data Analysis Agent for statistical processing, and a Synthesis Agent for generating the final report. The efficacy of such a system is not merely due to the division of labor but is critically dependent on the mechanisms for inter-agent communication. This requires a standardized communication protocol and a shared ontology, allowing agents to exchange data, delegate sub-tasks, and coordinate their actions to ensure the final output is coherent. -例如,一个复杂的研究问题可以这样拆分:由研究智能体负责信息检索、数据分析智能体负责统计处理、综合智能体负责生成最终报告。这类系统的效果不仅源于分工,更取决于智能体之间的通信机制。为此需要标准化的通信协议和共享机制,使智能体之间能够交换数据、委派子任务和协调行动,以确保最终结果的连贯一致。 +例如,一个复杂的研究问题可以这样拆分:由研究智能体负责信息检索、数据分析智能体负责统计处理、综合智能体负责生成最终报告。这类系统的效果不仅源于分工,更取决于智能体之间的通信机制。为此需要标准化的通信协议和共享本体(Ontology),使智能体之间能够交换数据、委派子任务和协调行动,以确保最终结果的连贯一致。 This distributed architecture offers several advantages, including enhanced modularity, scalability, and robustness, as the failure of a single agent does not necessarily cause a total system failure. The collaboration allows for a synergistic outcome where the collective performance of the multi-agent system surpasses the potential capabilities of any single agent within the ensemble. @@ -100,7 +100,7 @@ Multi-Agent Collaboration is a powerful pattern applicable across numerous domai The capacity to delineate specialized agents and meticulously orchestrate their interrelationships empowers developers to construct systems exhibiting enhanced modularity, scalability, and the ability to address complexities that would prove insurmountable for a singular, integrated agent. -将任务拆分给多个专业智能体并精心协调它们的协作,可以让开发者构建出更具模块化和可扩展性的系统,从而解决单个整体智能体无法应对的复杂问题。这正是多智能体协作模式的核心价值所在。 +将任务拆分给多个专业智能体并精心编排它们的协作,可以让开发者构建出更具模块化和可扩展性的系统,从而解决单个整体智能体无法应对的复杂问题。这正是多智能体协作模式的核心价值所在。 --- From 42f5ade0967b04c0ca54890bf777c41c0e018d12 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:00:03 +0800 Subject: [PATCH 04/17] Refine Exception Handling and Recovery pattern content Updated the Exception Handling and Recovery pattern overview with improved clarity and additional examples. Enhanced the Chinese translation for consistency and accuracy. 
--- 15-Chapter-12-Exception-Handling-and-Recovery.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/15-Chapter-12-Exception-Handling-and-Recovery.md b/15-Chapter-12-Exception-Handling-and-Recovery.md index a41fe3a..cf87630 100644 --- a/15-Chapter-12-Exception-Handling-and-Recovery.md +++ b/15-Chapter-12-Exception-Handling-and-Recovery.md @@ -20,7 +20,7 @@ This pattern may sometimes be used with reflection. For example, if an initial a The Exception Handling and Recovery pattern addresses the need for AI agents to manage operational failures. This pattern involves anticipating potential issues, such as tool errors or service unavailability, and developing strategies to mitigate them. These strategies may include error logging, retries, fallbacks, graceful degradation, and notifications. Additionally, the pattern emphasizes recovery mechanisms like state rollback, diagnosis, self-correction, and escalation, to restore agents to stable operation. Implementing this pattern enhances the reliability and robustness of AI agents, allowing them to function in unpredictable environments. Examples of practical applications include chatbots managing database errors, trading bots handling financial errors, and smart home agents addressing device malfunctions. The pattern ensures that agents can continue to operate effectively despite encountering complexities and failures. -「异常处理与恢复」模式解决了 AI 智能体管理运行故障的需求。该模式涉及预测潜在问题(如工具错误或服务不可用)并制定缓解策略。这些策略可能包括错误日志记录、重试、回退、优雅降级和通知。此外,该模式还强调了恢复机制(如状态回滚、诊断、自我纠正和上报升级),以使智能体恢复到稳定运行状态。实施此模式可增强 AI 智能体的可靠性和鲁棒性,使其能够在不可预测的环境中运行。实际应用示例包括:聊天机器人管理数据库错误、交易机器人处理金融错误,以及智能家居智能体解决设备故障。该模式确保智能体在遇到复杂情况和失败时仍能继续有效运行。 +「异常处理与恢复」模式解决了 AI 智能体管理运行故障的需求。该模式涉及预测潜在问题(如工具错误或服务不可用)并制定缓解策略。这些策略可能包括错误日志记录、重试、回退、优雅降级和通知。此外,该模式还强调了恢复机制(如状态回滚、诊断、自我纠正和升级处理),以使智能体恢复到稳定运行状态。实施此模式可增强 AI 智能体的可靠性和鲁棒性,使其能够在不可预测的环境中运行。实际应用示例包括:聊天机器人管理数据库错误、交易机器人处理金融错误,以及智能家居智能体解决设备故障。该模式确保智能体在遇到复杂情况和失败时仍能继续有效运行。 ![](/images/chapter12_fig1.png "Key Components") Fig.1: Key components of exception handling and recovery for AI agents @@ -67,7 +67,7 @@ Exception Handling and Recovery is critical for any agent deployed in a real-wor - Web Scraping Agents: When a web scraping agent encounters a CAPTCHA, a changed website structure, or a server error (e.g., 404 Not Found, 503 Service Unavailable), it needs to handle these gracefully. This could involve pausing, using a proxy, or reporting the specific URL that failed. -- 网页抓取智能体:当网页抓取智能体遇到 CAPTCHA、网站结构变更或服务器错误(例如,404 Not Found、503 Service Unavailable)时,它需要优雅地处理这些情况。这可能包括暂停、使用代理或报告失败的具体 URL。 +- 网页抓取智能体:当网页抓取智能体遇到验证码(CAPTCHA)、网站结构变更或服务器错误(例如,404 Not Found、503 Service Unavailable)时,它需要优雅地处理这些情况。这可能包括暂停、使用代理或报告失败的具体 URL。 - Robotics and Manufacturing: A robotic arm performing an assembly task might fail to pick up a component due to misalignment. It needs to detect this failure (e.g., via sensor feedback), attempt to readjust, retry the pickup, and if persistent, alert a human operator or switch to a different component. @@ -147,7 +147,7 @@ Why: The Exception Handling and Recovery pattern provides a standardized solutio Rule of thumb: Use this pattern for any AI agent deployed in a dynamic, real-world environment where system failures, tool errors, network issues, or unpredictable inputs are possible and operational reliability is a key requirement. 
-经验法则:任何部署在动态、真实世界环境且对操作可靠性要求极高的 AI 智能体,在这些场景中可能遭遇系统故障、工具错误、网络问题或不可预测的输入。 +经验法则:本模式适用于任何部署在动态、真实世界环境中的 AI 智能体,特别是当这些场景可能遭遇系统故障、工具错误、网络问题或不可预测的输入,且对操作可靠性有关键要求时。 # Visual Summary | 可视化总结 From 008001f9a2537778ad7ea7c2e0f407e7252505b0 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:03:21 +0800 Subject: [PATCH 05/17] Refine language and clarity in goal-setting chapter --- 17-Chapter-11-Goal-Setting-And-Monitoring.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/17-Chapter-11-Goal-Setting-And-Monitoring.md b/17-Chapter-11-Goal-Setting-And-Monitoring.md index 9b60c9c..bb00b8d 100644 --- a/17-Chapter-11-Goal-Setting-And-Monitoring.md +++ b/17-Chapter-11-Goal-Setting-And-Monitoring.md @@ -2,7 +2,7 @@ For AI agents to be truly effective and purposeful, they need more than just the ability to process information or use tools; they need a clear sense of direction and a way to know if they're actually succeeding. This is where the Goal Setting and Monitoring pattern comes into play. It's about giving agents specific objectives to work towards and equipping them with the means to track their progress and determine if those objectives have been met. -要让 AI 智能体真正有效且有目的性,它们需要的不仅仅是处理信息或使用工具的能力;它们需要明确的方向感,并能够知道自己是否真的在取得成功。这就是目标设定与监控模式发挥作用的地方。该模式旨在为智能体提供要努力实现的具体目标,并配备跟踪进度和判断这些目标是否实现的手段。 +要让 AI 智能体真正有效且有目的性,它们不仅仅需要处理信息或使用工具的能力,更需要明确的方向感,并能够知道自己是否真的在取得成功。这就是目标设定与监控模式发挥作用的地方。该模式旨在为智能体提供要努力实现的具体目标,并配备跟踪进度和判断这些目标是否实现的手段。 ## Goal Setting and Monitoring Pattern Overview | 目标设定与监控模式概述 @@ -66,7 +66,7 @@ It employs a "goal-setting and monitoring" pattern where it doesn't just generat You can best understand this script by imagining it as an autonomous AI programmer assigned to a project (see Fig. 1). The process begins when you hand the AI a detailed project brief, which is the specific coding problem it needs to solve. -您可以把它想象为,一个被分配到项目的自主 AI 程序员,这样可以更好地理解这个脚本(见图 1)。当您向 AI 提供详细的项目简报时 - 就是它需要解决的特定编程问题 - 它就开始工作了。 +您可以把它想象为,一个被分配到项目的自主 AI 程序员,这样可以更好地理解这个脚本(见图 1)。当您向 AI 提供详细的项目简报时 - 就是它需要解决的特定编程问题——它就开始工作了。 ```python # MIT License @@ -327,7 +327,7 @@ With this assignment in hand, the AI programmer gets to work and produces its fi If the verdict is "False," the AI doesn't give up. It enters a thoughtful revision phase, using the insights from its self-critique to pinpoint the weaknesses and intelligently rewrite the code. This cycle of drafting, self-reviewing, and refining continues, with each iteration aiming to get closer to the goals. This process repeats until the AI finally achieves a "True" status by satisfying every requirement, or until it reaches a predefined limit of attempts, much like a developer working against a deadline. Once the code passes this final inspection, the script packages the polished solution, adding helpful comments and saving it to a clean, new Python file, ready for use. -如果评判结果为“False”,AI 也不会放弃。它会进入一个深思熟虑的修订阶段,利用自我批判的见解来找出弱点,并智能地重写代码。这种起草、自我审查和优化的循环持续进行,朝向目标一次次迭代。这个过程重复进行,直到 AI 满足每一个要求,最终达到“True”状态,或者达到预先设定的尝试次数限制 - 就像一个面对截止日期的开发者一样。一旦代码通过了最终检查,脚本就会打包经过润色的解决方案,添加有用的注释,并将其保存到一个新的 Python 文件中,以待使用。 +如果评判结果为“False”,AI 也不会放弃。它会进入一个深思熟虑的修订阶段,利用自我批判的见解来找出弱点,并智能地重写代码。这种起草、自我审查和优化的循环持续进行,朝向目标一次次迭代。这个过程重复进行,直到 AI 满足每一个要求,最终达到“True”状态,或者达到预先设定的尝试次数限制——就像一个面对截止日期的开发者一样。一旦代码通过了最终检查,脚本就会打包经过润色的解决方案,添加有用的注释,并将其保存到一个新的 Python 文件中,以待使用。 **Caveats and Considerations:** It is important to note that this is an exemplary illustration and not production-ready code. 
For real-world applications, several factors must be taken into account. An LLM may not fully grasp the intended meaning of a goal and might incorrectly assess its performance as successful. Even if the goal is well understood, the model may hallucinate. When the same LLM is responsible for both writing the code and judging its quality, it may have a harder time discovering it is going in the wrong direction. From b63ef033d6269dd2f7bbf6ac318c6be125426f4a Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:06:28 +0800 Subject: [PATCH 06/17] Update exception handling pattern overview and examples --- 18-Chapter-12-Exception-Handling-and-Recovery.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/18-Chapter-12-Exception-Handling-and-Recovery.md b/18-Chapter-12-Exception-Handling-and-Recovery.md index 7f636d1..2f4a037 100644 --- a/18-Chapter-12-Exception-Handling-and-Recovery.md +++ b/18-Chapter-12-Exception-Handling-and-Recovery.md @@ -20,7 +20,7 @@ This pattern may sometimes be used with reflection. For example, if an initial a The Exception Handling and Recovery pattern addresses the need for AI agents to manage operational failures. This pattern involves anticipating potential issues, such as tool errors or service unavailability, and developing strategies to mitigate them. These strategies may include error logging, retries, fallbacks, graceful degradation, and notifications. Additionally, the pattern emphasizes recovery mechanisms like state rollback, diagnosis, self-correction, and escalation, to restore agents to stable operation. Implementing this pattern enhances the reliability and robustness of AI agents, allowing them to function in unpredictable environments. Examples of practical applications include chatbots managing database errors, trading bots handling financial errors, and smart home agents addressing device malfunctions. The pattern ensures that agents can continue to operate effectively despite encountering complexities and failures. -「异常处理与恢复」模式解决了 AI 智能体管理运行故障的需求。该模式涉及预测潜在问题(如工具错误或服务不可用)并制定缓解策略。这些策略可能包括错误日志记录、重试、回退、优雅降级和通知。此外,该模式还强调了恢复机制(如状态回滚、诊断、自我纠正和上报升级),以使智能体恢复到稳定运行状态。实施此模式可增强 AI 智能体的可靠性和鲁棒性,使其能够在不可预测的环境中运行。实际应用示例包括:聊天机器人管理数据库错误、交易机器人处理金融错误,以及智能家居智能体解决设备故障。该模式确保智能体在遇到复杂情况和失败时仍能继续有效运行。 +「异常处理与恢复」模式解决了 AI 智能体管理运行故障的需求。该模式涉及预测潜在问题(如工具错误或服务不可用)并制定缓解策略。这些策略可能包括错误日志记录、重试、回退、优雅降级和通知。此外,该模式还强调了恢复机制(如状态回滚、诊断、自我纠正和升级处理),以使智能体恢复到稳定运行状态。实施此模式可增强 AI 智能体的可靠性和鲁棒性,使其能够在不可预测的环境中运行。实际应用示例包括:聊天机器人管理数据库错误、交易机器人处理金融错误,以及智能家居智能体解决设备故障。该模式确保智能体在遇到复杂情况和失败时仍能继续有效运行。 ![](/images/chapter12_fig1.png "Key Components") Fig.1: Key components of exception handling and recovery for AI agents @@ -67,7 +67,7 @@ Exception Handling and Recovery is critical for any agent deployed in a real-wor - Web Scraping Agents: When a web scraping agent encounters a CAPTCHA, a changed website structure, or a server error (e.g., 404 Not Found, 503 Service Unavailable), it needs to handle these gracefully. This could involve pausing, using a proxy, or reporting the specific URL that failed. -- 网页抓取智能体:当网页抓取智能体遇到 CAPTCHA、网站结构变更或服务器错误(例如,404 Not Found、503 Service Unavailable)时,它需要优雅地处理这些情况。这可能包括暂停、使用代理或报告失败的具体 URL。 +- 网页抓取智能体:当网页抓取智能体遇到验证码(CAPTCHA)、网站结构变更或服务器错误(例如,404 Not Found、503 Service Unavailable)时,它需要优雅地处理这些情况。这可能包括暂停、使用代理或报告失败的具体 URL。 - Robotics and Manufacturing: A robotic arm performing an assembly task might fail to pick up a component due to misalignment. 
It needs to detect this failure (e.g., via sensor feedback), attempt to readjust, retry the pickup, and if persistent, alert a human operator or switch to a different component.

@@ -147,7 +147,7 @@ Why: The Exception Handling and Recovery pattern provides a standardized solutio

Rule of thumb: Use this pattern for any AI agent deployed in a dynamic, real-world environment where system failures, tool errors, network issues, or unpredictable inputs are possible and operational reliability is a key requirement.

-经验法则:任何部署在动态、真实世界环境且对操作可靠性要求极高的 AI 智能体,在这些场景中可能遭遇系统故障、工具错误、网络问题或不可预测的输入。
+经验法则:本模式适用于任何部署在动态、真实世界环境中的 AI 智能体,特别是当这些场景中可能遭遇系统故障、工具错误、网络问题或不可预测的输入,且对操作可靠性有关键要求时。

# Visual Summary | 可视化总结

From 3edbe38721f2b794431570c9c1ad3077fe39638d Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:10:29 +0800 Subject: [PATCH 07/17] Refine explanations and translations in Knowledge Retrieval

Updated the text to improve clarity and consistency in the explanations of embeddings, text similarity, chunking, and Graph RAG. Adjusted translations and phrasing for better readability. --- 20-Chapter-14-Knowledge-Retrieval-RAG.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/20-Chapter-14-Knowledge-Retrieval-RAG.md b/20-Chapter-14-Knowledge-Retrieval-RAG.md index 2b41df7..d97d1e5 100644 --- a/20-Chapter-14-Knowledge-Retrieval-RAG.md +++ b/20-Chapter-14-Knowledge-Retrieval-RAG.md @@ -28,7 +28,7 @@ To fully appreciate how RAG functions, it's essential to understand a few core c

Embeddings: In the context of LLMs, embeddings are numerical representations of text, such as words, phrases, or entire documents. These representations are in the form of a vector, which is a list of numbers. The key idea is to capture the semantic meaning and the relationships between different pieces of text in a mathematical space. Words or phrases with similar meanings will have embeddings that are closer to each other in this vector space. For instance, imagine a simple 2D graph. The word "cat" might be represented by the coordinates (2, 3), while "kitten" would be very close at (2.1, 3.1). In contrast, the word "car" would have a distant coordinate like (8, 1), reflecting its different meaning. In reality, these embeddings are in a much higher-dimensional space with hundreds or even thousands of dimensions, allowing for a very nuanced understanding of language.

-嵌入(Embeddings):在 LLM 的语境中,嵌入是以数字形式表示文本,例如词语、短语或整个文档。这些表示以向量(即数字的列表)的形式存在。其核心思想是在一个数学空间中捕捉不同文本片段之间的语义含义和关系。含义相近的词语或短语,其嵌入在向量空间中的距离也更近。例如,在一个简单的二维图表中,「cat」(猫)一词可能由坐标 (2, 3) 表示,而「kitten」(小猫)也会位于非常接近的 (2.1, 3.1)。相比之下,「car」(小汽车)一词的坐标则可能在很远的位置,如 (8, 1),反映了其不同的含义。实际上,这些嵌入存在于维度高得多的空间中,拥有数百甚至数千个维度,从而能够对语言有非常细致的理解。
+嵌入(Embeddings):在 LLM 的语境中,嵌入是文本的数值表示,例如词语、短语或整个文档。这些表示以向量(即数字的列表)的形式存在。其核心思想是在一个数学空间中捕捉不同文本片段之间的语义含义和关系。含义相近的词语或短语,其嵌入在向量空间中的距离也更近。例如,在一个简单的二维图表中,「cat」(猫)一词可能由坐标 (2, 3) 表示,而「kitten」(小猫)也会位于非常接近的 (2.1, 3.1)。相比之下,「car」(小汽车)一词的坐标则可能在很远的位置,如 (8, 1),反映了其不同的含义。实际上,这些嵌入存在于维度高得多的空间中,拥有数百甚至数千个维度,从而能够对语言有非常细致的理解。

Text Similarity: Text similarity refers to the measure of how alike two pieces of text are. This can be at a surface level, looking at the overlap of words (lexical similarity), or at a deeper, meaning-based level. In the context of RAG, text similarity is crucial for finding the most relevant information in the knowledge base that corresponds to a user's query.
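The two paragraphs above describe embeddings and text similarity in the abstract. A minimal, self-contained sketch (not taken from the chapter's code) can make the idea concrete: once texts are represented as vectors, closeness is typically scored with cosine similarity. The toy vectors below reuse the 2D "cat"/"kitten"/"car" illustration from the text; a real system would obtain high-dimensional vectors from an embedding model and store them in a vector database.

```python
# Illustrative sketch: cosine similarity over toy "embedding" vectors.
# The numbers are made up for demonstration; they are not model output.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

embeddings = {
    "cat":    [2.0, 3.0],
    "kitten": [2.1, 3.1],
    "car":    [8.0, 1.0],
}

query = embeddings["cat"]
for word, vector in embeddings.items():
    print(f"{word}: {cosine_similarity(query, vector):.3f}")
# "kitten" scores close to 1.0 relative to "cat", while "car" scores lower,
# mirroring the 2D example in the text above.
```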
@@ -53,7 +53,7 @@ Fig.1: RAG Core Concepts: Chunking, Embeddings, and Vector Database Chunking of Documents: Chunking is the process of breaking down large documents into smaller, more manageable pieces, or "chunks." For a RAG system to work efficiently, it cannot feed entire large documents into the LLM. Instead, it processes these smaller chunks. The way documents are chunked is important for preserving the context and meaning of the information. -文档分块:分块是将大型文档分解成更小、更易于管理的小块或「块」(chunks)的过程。为了让 RAG 系统高效工作,不能将整个大型文档输入给 LLM,而是处理这些较小的块。文档的分块方式对于保留信息的上下文和含义非常重要。 +文档分块:分块是将大型文档分解成更小、更易于管理的片段或「块」(chunks)的过程。为了让 RAG 系统高效工作,不能将整个大型文档输入给 LLM,而是处理这些较小的块。文档的分块方式对于保留信息的上下文和含义非常重要。 For instance, instead of treating a 50-page user manual as a single block of text, a chunking strategy might break it down into sections, paragraphs, or even sentences. For instance, a section on "Troubleshooting" would be a separate chunk from the "Installation Guide." When a user asks a question about a specific problem, the RAG system can then retrieve the most relevant troubleshooting chunk, rather than the entire manual. This makes the retrieval process faster and the information provided to the LLM more focused and relevant to the user's immediate need. @@ -85,7 +85,7 @@ The system's effectiveness is also highly dependent on the quality of the chunki Besides that, another challenge is that RAG requires the entire knowledge base to be pre-processed and stored in specialized databases, such as vector or graph databases, which is a considerable undertaking. Consequently, this knowledge requires periodic reconciliation to remain up-to-date, a crucial task when dealing with evolving sources like company wikis. This entire process can have a noticeable impact on performance, increasing latency, operational costs, and the number of tokens used in the final prompt. -除此之外,另一个挑战是 RAG 要求整个知识库都经过预处理,并且存储在专门的数据库中,如向量数据库或图数据库。这是一项相当大的工程。因此,这些知识需要定期的同步以保持更新。在处理像公司维基这样不断变化的来源时,这就是一项至关重要的任务。整个过程可能对性能产生显著影响,增加延迟、运营成本以及最终 Prompt 中使用的 token 数量。 +除此之外,另一个挑战是 RAG 要求整个知识库都经过预处理,并且存储在专门的数据库中,如向量数据库或图数据库。这是一项相当大的工程。因此,这些知识需要定期的校准以保持时效性。在处理像公司维基这样不断变化的来源时,这就是一项至关重要的任务。整个过程可能对性能产生显著影响,增加延迟、运营成本以及最终 Prompt 中使用的 token 数量。 In summary, the Retrieval-Augmented Generation (RAG) pattern represents a significant leap forward in making AI more knowledgeable and reliable. By seamlessly integrating an external knowledge retrieval step into the generation process, RAG addresses some of the core limitations of standalone LLMs. The foundational concepts of embeddings and semantic similarity, combined with retrieval techniques like keyword and hybrid search, allow the system to intelligently find relevant information, which is made manageable through strategic chunking. @@ -97,7 +97,7 @@ This entire retrieval process is powered by specialized vector databases designe Graph RAG: GraphRAG is an advanced form of Retrieval-Augmented Generation that utilizes a knowledge graph instead of a simple vector database for information retrieval. It answers complex queries by navigating the explicit relationships (edges) between data entities (nodes) within this structured knowledge base. A key advantage is its ability to synthesize answers from information fragmented across multiple documents, a common failing of traditional RAG. By understanding these connections, GraphRAG provides more contextually accurate and nuanced responses. 
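Because "navigating explicit relationships" can sound abstract, a deliberately tiny sketch of graph-based retrieval may help. The adjacency map and two-hop search below are illustrative stand-ins (echoing the chapter's gene/disease use case), not a real GraphRAG implementation or knowledge-graph store.

```python
# Minimal illustration of graph-based retrieval: entities are nodes and the
# edges carry typed relationships. Answering "what might treat disease D?"
# means walking edges rather than ranking isolated text chunks.
knowledge_graph = {
    "GeneX":     [("associated_with", "DiseaseD"), ("inhibited_by", "CompoundY")],
    "CompoundY": [("inhibits", "GeneX"), ("studied_in", "Trial123")],
    "DiseaseD":  [("linked_to", "GeneX")],
}

def one_hop(entity: str) -> list[tuple[str, str]]:
    """Return (relation, neighbor) pairs for a single entity."""
    return knowledge_graph.get(entity, [])

def connect(start: str, goal: str) -> list[str]:
    """Check whether two entities are connected within two hops (toy search)."""
    for rel1, mid in one_hop(start):
        if mid == goal:
            return [f"{start} -{rel1}-> {goal}"]
        for rel2, end in one_hop(mid):
            if end == goal:
                return [f"{start} -{rel1}-> {mid}", f"{mid} -{rel2}-> {end}"]
    return []

# Facts fragmented across sources ("CompoundY inhibits GeneX", "GeneX is linked
# to DiseaseD") combine into one answer path, the kind of synthesis that plain
# vector search over isolated chunks tends to miss.
print(connect("CompoundY", "DiseaseD"))
```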
-图 RAG(Graph RAG):GraphRAG 是一种先进的 RAG 形式,它利用知识图谱而非简单的向量数据库进行信息检索。它通过在结构化知识库中导航数据实体(节点)之间的显式关系(边)来回答复杂查询。其一个关键优势是能够综合来自多个文档中的碎片化信息来生成答案,而这正是传统 RAG 的一个常见短板。通过理解这些联系,GraphRAG 能够提供与上下文相符、更细致的响应。 +图 RAG(Graph RAG):GraphRAG 是一种先进的 RAG 形式,它利用知识图谱而非简单的向量数据库进行信息检索。它通过在结构化知识库中遍历数据实体(节点)之间的显式关系(边)来回答复杂查询。其一个关键优势是能够综合来自多个文档中的碎片化信息来生成答案,而这正是传统 RAG 的一个常见短板。通过理解这些联系,GraphRAG 能够提供与上下文相符、更细致的响应。 Use cases include complex financial analysis, connecting companies to market events, and scientific research for discovering relationships between genes and diseases. The primary drawback, however, is the significant complexity, cost, and expertise required to build and maintain a high-quality knowledge graph. This setup is also less flexible and can introduce higher latency compared to simpler vector search systems. The system's effectiveness is entirely dependent on the quality and completeness of the underlying graph structure. Consequently, GraphRAG offers superior contextual reasoning for intricate questions but at a much higher implementation and maintenance cost. In summary, it excels where deep, interconnected insights are more critical than the speed and simplicity of standard RAG. @@ -118,7 +118,7 @@ Fig.2: Agentic RAG introduces a reasoning agent that actively evaluates, reconci Second, an agent is adept at reconciling knowledge conflicts. Imagine a financial analyst asks, "What was Project Alpha's Q1 budget?" The system retrieves two documents: an initial proposal stating a €50,000 budget and a finalized financial report listing it as €65,000. An Agentic RAG would identify this contradiction, prioritize the financial report as the more reliable source, and provide the LLM with the verified figure, ensuring the final answer is based on the most accurate data. -其次,智能体善于解决知识冲突。想象一位财务分析师问:「Alpha 项目第一季度的预算是多少?」系统检索到两份文件:一份是初步提案,预算为 50,000 欧元;另一份是最终财务报告,预算为 65,000 欧元。一个智能体式 RAG 会识别出这种矛盾,将财务报告作为更可靠的来源优先处理,并向 LLM 提供核实后的数字,确保最终答案基于最准确的数据。 +其次,智能体善于调和知识冲突。想象一位财务分析师问:「Alpha 项目第一季度的预算是多少?」系统检索到两份文件:一份是初步提案,预算为 50,000 欧元;另一份是最终财务报告,预算为 65,000 欧元。一个智能体式 RAG 会识别出这种矛盾,将财务报告作为更可靠的来源优先处理,并向 LLM 提供核实后的数字,确保最终答案基于最准确的数据。 Third, an agent can perform multi-step reasoning to synthesize complex answers. If a user asks, "How do our product's features and pricing compare to Competitor X's?" the agent would decompose this into separate sub-queries. It would initiate distinct searches for its own product's features, its pricing, Competitor X's features, and Competitor X's pricing. After gathering these individual pieces of information, the agent would synthesize them into a structured, comparative context before feeding it to the LLM, enabling a comprehensive response that a simple retrieval could not have produced. From 5ce1c06ac9194351532b61319fb802adb49a8d08 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:12:30 +0800 Subject: [PATCH 08/17] Refine content on ADK multi-agent architecture Updated the text to enhance clarity and consistency in the discussion of Google's ADK and its multi-agent architecture. 
--- 22-Chapter-16-Resource-Aware-Optimization.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/22-Chapter-16-Resource-Aware-Optimization.md b/22-Chapter-16-Resource-Aware-Optimization.md index 12fd25b..8955331 100644 --- a/22-Chapter-16-Resource-Aware-Optimization.md +++ b/22-Chapter-16-Resource-Aware-Optimization.md @@ -58,7 +58,7 @@ However, once the plan is established, the individual tasks within that plan, su

Google's ADK supports this approach through its multi-agent architecture, which allows for modular and scalable applications. Different agents can handle specialized tasks. Model flexibility enables the direct use of various Gemini models, including both Gemini Pro and Gemini Flash, or integration of other models through LiteLLM. The ADK's orchestration capabilities support dynamic, LLM-driven routing for adaptive behavior. Built-in evaluation features allow systematic assessment of agent performance, which can be used for system refinement (see the Chapter on Evaluation and Monitoring).

-Google 的 ADK 通过其多智能体架构支持这种方法,允许模块化和可扩展的应用程序。不同的智能体可以处理专门的任务。模型灵活性支持直接使用各种 Gemini 模型,包括 Gemini ProGemini Flash,或通过 LiteLLM 集成其他模型。ADK 的编排功能支持动态的、由 LLM 驱动的路由,以实现自适应行为。内置的评估功能允许系统化评估智能体性能,可用于系统优化(参见评估与监控章节)。
+Google 的 ADK 通过其多智能体架构支持这种方法,支持模块化和可扩展的应用程序。不同的智能体可以处理专门的任务。模型灵活性支持直接使用各种 Gemini 模型,包括 Gemini Pro 和 Gemini Flash,或通过 LiteLLM 集成其他模型。ADK 的编排功能支持动态的、由 LLM 驱动的路由,以实现自适应行为。内置的评估功能允许系统化评估智能体性能,可用于系统优化(参见评估与监控章节)。

Next, two agents with identical setup but utilizing different models and costs will be defined.

@@ -480,7 +480,7 @@ Resource-aware optimization is paramount in developing intelligent agent systems

* **Optimization Through Feedback and Flexibility**: Evaluation capabilities for critique and model integration flexibility contribute to adaptive and self-improving system behavior.

-* 通过反馈和灵活性进行优化:评论的评估能力和模型集成灵活性有助于自适应和自我改进的系统行为。
+* 通过反馈和灵活性进行优化:评论(智能体)的评估能力和模型集成灵活性有助于自适应和自我改进的系统行为。

* **Additional Resource-Aware Optimizations**: Other methods include Adaptive Tool Use & Selection, Contextual Pruning & Summarization, Proactive Resource Prediction, Cost-Sensitive Exploration in Multi-Agent Systems, Energy-Efficient Deployment, Parallelization & Distributed Computing Awareness, Learned Resource Allocation Policies, Graceful Degradation and Fallback Mechanisms, and Prioritization of Critical Tasks.

From ef10f9400123362bda4d4f97d37c68ddf049d519 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:14:49 +0800 Subject: [PATCH 09/17] Enhance AI agent evaluation methods and contract processes

Refactor evaluation methods for AI agents, detailing test and evalset file structures and their purposes. Emphasize the importance of formalized contracts and dynamic negotiation in AI interactions. --- 25-Chapter-19-Evaluation-and-Monitoring.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/25-Chapter-19-Evaluation-and-Monitoring.md b/25-Chapter-19-Evaluation-and-Monitoring.md index 51f54b0..dbbd879 100644 --- a/25-Chapter-19-Evaluation-and-Monitoring.md +++ b/25-Chapter-19-Evaluation-and-Monitoring.md @@ -331,7 +331,7 @@ This involves examining the quality of decisions, the reasoning process, and the

Evaluation of AI agents involves two primary approaches: using test files and using evalset files. Test files, in JSON format, represent single, simple agent-model interactions or sessions and are ideal for unit testing during active development, focusing on rapid execution and simple session complexity.
Each test file contains a single session with multiple turns, where a turn is a user-agent interaction including the user's query, expected tool use trajectory, intermediate agent responses, and final response. For example, a test file might detail a user request to "Turn off device_2 in the Bedroom," specifying the agent's use of a set_device_info tool with parameters like location: Bedroom, device_id: device_2, and status: OFF, and an expected final response of "I have set the device_2 status to off." Test files can be organized into folders and may include a test_config.json file to define evaluation criteria. Evalset files utilize a dataset called an "evalset" to evaluate interactions, containing multiple potentially lengthy sessions suited for simulating complex, multi-turn conversations and integration tests. An evalset file comprises multiple "evals," each representing a distinct session with one or more "turns" that include user queries, expected tool use, intermediate responses, and a reference final response. An example evalset might include a session where the user first asks "What can you do?" and then says "Roll a 10 sided dice twice and then check if 9 is a prime or not," defining expected roll\\\_die tool calls and a check_prime tool call, along with the final response summarizing the dice rolls and the prime check. -AI 智能体的评估涉及两种主要方法:使用测试文件和使用评估集(evalset)文件。测试文件采用 JSON 格式,代表单个、简单的智能体-模型交互或会话,非常适合在积极开发期间进行单元测试,专注于快速执行和简单的会话复杂性。每个测试文件包含一个具有多个回合的单个会话,其中回合是用户-智能体交互,包括用户查询、预期工具使用轨迹、中间智能体响应和最终响应。例如,测试文件可能详细说明用户请求「Turn off device_2 in the Bedroom」,指定智能体使用 set_device_info 工具及参数如 location: Bedroomdevice_id: device_2status: OFF,以及预期的最终响应「I have set the device_2 status to off」。测试文件可以组织到文件夹中,并可能包含一个 test_config.json 文件来定义评估标准。评估集文件利用称为 evalset 的数据集来评估交互,包含多个可能很长的会话,适合模拟复杂的多回合对话和集成测试。评估集文件包含多个 evals,每个代表一个独特的会话,具有一个或多个回合,包括用户查询、预期工具使用、中间响应和参考最终响应。示例评估集可能包括一个会话,其中用户首先询问「What can you do?」,然后说「Roll a 10 sided dice twice and then check if 9 is a prime or not」,定义预期的 roll_die 工具调用和 check_prime 工具调用,以及总结骰子投掷和素数检查的最终响应。 +AI 智能体的评估涉及两种主要方法:使用测试文件和使用评估集(evalset)文件。测试文件采用 JSON 格式,代表单个、简单的智能体-模型交互或会话,非常适合在积极开发期间进行单元测试,专注于快速执行和简单的会话复杂性。每个测试文件包含一个具有多个回合的单个会话,其中回合是用户-智能体交互,包括用户查询、预期工具使用轨迹、中间智能体响应和最终响应。例如,测试文件可能详细说明用户请求「Turn off device_2 in the Bedroom」,指定智能体使用 set_device_info 工具及参数如 location: Bedroomdevice_id: device_2status: OFF,以及预期的最终响应「I have set the device_2 status to off」。测试文件可以组织到文件夹中,并可能包含一个 test_config.json 文件来定义评估标准。评估集文件利用称为 evalset 的数据集来评估交互,包含多个可能很长的会话,适合模拟复杂的多回合对话和集成测试。评估集文件包含多个 evals,每个代表一个独立的会话,具有一个或多个回合,包括用户查询、预期工具使用、中间响应和参考最终响应。示例评估集可能包括一个会话,其中用户首先询问「What can you do?」,然后说「Roll a 10 sided dice twice and then check if 9 is a prime or not」,定义预期的 roll_die 工具调用和 check_prime 工具调用,以及总结骰子投掷和素数检查的最终响应。 **Multi-agents**: Evaluating a complex AI system with multiple agents is much like assessing a team project. Because there are many steps and handoffs, its complexity is an advantage, allowing you to check the quality of work at each stage. You can examine how well each individual "agent" performs its specific job, but you must also evaluate how the entire system is performing as a whole. @@ -371,11 +371,11 @@ Today's common AI agents operate on brief, underspecified instructions, which ma First is the pillar of the Formalized Contract, a detailed specification that serves as the single source of truth for a task. It goes far beyond a simple prompt. 
For example, a contract for a financial analysis task wouldn't just say "analyze last quarter's sales"; it would demand "a 20-page PDF report analyzing European market sales from Q1 2025, including five specific data visualizations, a comparative analysis against Q1 2024, and a risk assessment based on the included dataset of supply chain disruptions." This contract explicitly defines the required deliverables, their precise specifications, the acceptable data sources, the scope of work, and even the expected computational cost and completion time, making the outcome objectively verifiable. -首先是正式化合约(Formalized Contract)的支柱,这是一个详细的规范,作为任务的单一真实来源。它远远超出了简单的提示。例如,财务分析任务的合约不会只是说「分析上一季度的销售额」;它会要求「一份 20 页的 PDF 报告,分析 2025 年第一季度欧洲市场销售情况,包括五个特定的数据可视化、与 2024 年第一季度的比较分析,以及基于包含的供应链中断数据集的风险评估」。此合约明确定义了所需的交付成果、其精确规范、可接受的数据源、工作范围,甚至预期的计算成本和完成时间,使结果客观可验证。 +第一个支柱是正式化合约(Formalized Contract),这是一个详细的规范,作为任务的单一真实来源。它远远超出了简单的提示。例如,财务分析任务的合约不会只是说「分析上一季度的销售额」;它会要求「一份 20 页的 PDF 报告,分析 2025 年第一季度欧洲市场销售情况,包括五个特定的数据可视化、与 2024 年第一季度的比较分析,以及基于包含的供应链中断数据集的风险评估」。此合约明确定义了所需的交付成果、其精确规范、可接受的数据源、工作范围,甚至预期的计算成本和完成时间,使结果客观可验证。 Second is the pillar of a Dynamic Lifecycle of Negotiation and Feedback. The contract is not a static command but the start of a dialogue. The contractor agent can analyze the initial terms and negotiate. For instance, if a contract demands the use of a specific proprietary data source the agent cannot access, it can return feedback stating, "The specified XYZ database is inaccessible. Please provide credentials or approve the use of an alternative public database, which may slightly alter the data's granularity." This negotiation phase, which also allows the agent to flag ambiguities or potential risks, resolves misunderstandings before execution begins, preventing costly failures and ensuring the final output aligns perfectly with the user's actual intent. -第二是协商与反馈的动态生命周期(Dynamic Lifecycle of Negotiation and Feedback)的支柱。合约不是静态命令,而是对话的开始。承包商智能体可以分析初始条款并协商。例如,如果合约要求使用智能体无法访问的特定专有数据源,它可以返回反馈说:「指定的 XYZ 数据库无法访问。请提供凭据或批准使用替代公共数据库,这可能会略微改变数据的粒度。」这个协商阶段还允许智能体标记模糊性或潜在风险,在执行开始之前解决误解,防止代价高昂的失败,并确保最终输出完全符合用户的实际意图。 +第二个支柱是协商与反馈的动态生命周期(Dynamic Lifecycle of Negotiation and Feedback)。合约不是静态命令,而是对话的开始。承包商智能体可以分析初始条款并协商。例如,如果合约要求使用智能体无法访问的特定专有数据源,它可以返回反馈说:「指定的 XYZ 数据库无法访问。请提供凭据或批准使用替代公共数据库,这可能会略微改变数据的粒度。」这个协商阶段还允许智能体标记模糊性或潜在风险,在执行开始之前解决误解,防止代价高昂的失败,并确保最终输出完全符合用户的实际意图。 ![Contract execution example among agents](/images/chapter19_fig2.png) @@ -499,4 +499,4 @@ Relevant research includes: - ADK Evaluate: - Survey on Evaluation of LLM-based Agents, - Agent-as-a-Judge: Evaluate Agents with Agents, -- Agent Companion, gulli et al: \ No newline at end of file +- Agent Companion, gulli et al: From 1e6c55d73f24dbb4f2a2c9784af2ea55387b4262 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:18:38 +0800 Subject: [PATCH 10/17] Refine Chinese translations in Exploration and Discovery Updated Chinese translations for clarity and consistency. --- 27-Chapter-21-Exploration-and-Discovery.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/27-Chapter-21-Exploration-and-Discovery.md b/27-Chapter-21-Exploration-and-Discovery.md index 91ff7a9..0ebb890 100644 --- a/27-Chapter-21-Exploration-and-Discovery.md +++ b/27-Chapter-21-Exploration-and-Discovery.md @@ -22,7 +22,7 @@ Examples: - **Game Playing and Strategy Generation:** Agents explore game states, discovering emergent strategies or identifying vulnerabilities in game environments (e.g., AlphaGo). 
- 游戏和策略生成:智能体探索游戏状态,发现涌现策略或识别游戏环境中的漏洞(例如 AlphaGo)。 + 游戏博弈与策略生成:智能体探索游戏状态,发现涌现策略或识别游戏环境中的漏洞(例如 AlphaGo)。 - **Market Research and Trend Spotting:** Agents scan unstructured data (social media, news, reports) to identify trends, consumer behaviors, or market opportunities. @@ -102,7 +102,7 @@ The system follows an iterative "generate, debate, and evolve" approach mirrorin **Automated and Expert Evaluation:** On the challenging GPQA benchmark, the system's internal Elo rating was shown to be concordant with the accuracy of its results, achieving a top-1 accuracy of 78.4% on the difficult "diamond set". Analysis across over 200 research goals demonstrated that scaling test-time compute consistently improves the quality of hypotheses, as measured by the Elo rating. On a curated set of 15 challenging problems, the AI co-scientist outperformed other state-of-the-art AI models and the "best guess" solutions provided by human experts. In a small-scale evaluation, biomedical experts rated the co-scientist's outputs as more novel and impactful compared to other baseline models. The system's proposals for drug repurposing, formatted as NIH Specific Aims pages, were also judged to be of high quality by a panel of six expert oncologists. -自动化和专家评估:在具有挑战性的 GPQA 基准测试中,该系统的内部 Elo 评级与其结果的准确性一致,在困难的「钻石集」上达到了 78.4% 的 top-1 准确率。对超过 200 个研究目标的分析表明,扩展测试时计算可以持续提高假设质量(通过 Elo 评级衡量)。在精心策划的 15 个具有挑战性的问题集上,AI 协作科学家的表现优于其他最先进的 AI 模型和人类专家提供的「最佳猜测」解决方案。在小规模评估中,生物医学专家认为协作科学家的输出比其他基线模型更新颖、更具影响力。该系统提出的药物再利用提案(格式化为 NIH 特定目标页面)也被六位肿瘤学专家小组评为高质量。 +自动化和专家评估:在具有挑战性的 GPQA 基准测试中,该系统的内部 Elo 评级与其结果的准确性一致,在困难的「钻石集」上达到了 78.4% 的 top-1 准确率。对超过 200 个研究目标的分析表明,扩展测试时计算可以持续提高假设质量(通过 Elo 评级衡量)。在精心策划的 15 个具有挑战性的问题集上,AI 协作科学家的表现优于其他最先进的 AI 模型和人类专家提供的「最佳猜测」解决方案。在小规模评估中,生物医学专家认为协作科学家的输出比其他基线模型更新颖、更具影响力。该系统提出的药物重定位提案(格式化为 NIH 特定目标页面)也被六位肿瘤学专家小组评为高质量。 **End-to-End Experimental Validation:** @@ -199,7 +199,7 @@ class ReviewersAgent: The judgment agents are designed with a specific prompt that closely emulates the cognitive framework and evaluation criteria typically employed by human reviewers. This prompt guides the agents to analyze outputs through a lens similar to how a human expert would, considering factors like relevance, coherence, factual accuracy, and overall quality. By crafting these prompts to mirror human review protocols, the system aims to achieve a level of evaluative sophistication that approaches human-like discernment. -判断智能体设计了特定的提示词,紧密模拟人类评审者通常采用的认知框架和评估标准。该提示词指导智能体通过类似于人类专家的视角来分析输出,考虑相关性、连贯性、事实准确性和整体质量等因素。通过设计这些提示词以镜像人类评审协议,该系统旨在达到接近人类辨别力的评估复杂程度。 +判断智能体被设计为使用特定的提示词,紧密模拟人类评审者通常采用的认知框架和评估标准。该提示词指导智能体通过类似于人类专家的视角来分析输出,考虑相关性、连贯性、事实准确性和整体质量等因素。通过设计这些提示词以镜像人类评审协议,该系统旨在达到接近人类辨别力的评估复杂程度。 ````python def get_score(outlined_plan, latex, reward_model_llm, reviewer_type=None, attempts=3, openai_api_key=None): From c45c619b634704f2aeb25d6f5030c291697093c6 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:21:26 +0800 Subject: [PATCH 11/17] Update routing section with improved clarity --- 08-Chapter-02-Routing.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/08-Chapter-02-Routing.md b/08-Chapter-02-Routing.md index ad68a0b..b199925 100644 --- a/08-Chapter-02-Routing.md +++ b/08-Chapter-02-Routing.md @@ -86,7 +86,7 @@ In human-computer interaction, such as with virtual assistants or AI-driven tuto Within automated data and document processing pipelines, routing serves as a classification and distribution function. 
Incoming data, such as emails, support tickets, or API payloads, is analyzed based on content, metadata, or format. The system then directs each item to a corresponding workflow, such as a sales lead ingestion process, a specific data transformation function for JSON or CSV formats, or an urgent issue escalation path. -**数据处理流水线**:在自动化数据和文档处理流水线中,路由充当分类和分发功能。系统基于内容、元数据或格式对传入的数据(如电子邮件、支持工单或 API 负载)进行分析,然后将每项内容导向相应的工作流,例如销售线索处理流程、针对 JSON 或 CSV 格式的特定数据转换功能,或紧急问题升级路径。 +**数据处理流水线**:在自动化数据和文档处理流水线中,路由充当分类和分发功能。系统基于内容、元数据或格式对传入的数据(如电子邮件、支持工单或 API 负载)进行分析,然后将每项内容导向相应的工作流,例如销售线索录入流程、针对 JSON 或 CSV 格式的特定数据转换功能,或紧急问题升级路径。 In complex systems involving multiple specialized tools or agents, routing acts as a high-level dispatcher. A research system composed of distinct agents for searching, summarizing, and analyzing information would use a router to assign tasks to the most suitable agent based on the current objective. Similarly, an AI coding assistant uses routing to identify the programming language and user's intent—to debug, explain, or translate—before passing a code snippet to the correct specialized tool. @@ -460,7 +460,7 @@ The main function demonstrates the system's usage by running the coordinator wit **Rule of Thumb:** Use the Routing pattern when an agent must decide between multiple distinct workflows, tools, or sub-agents based on the user's input or the current state. It is essential for applications that need to triage or classify incoming requests to handle different types of tasks, such as a customer support bot distinguishing between sales inquiries, technical support, and account management questions. -适用场景:当智能体必须根据用户输入或当前状态在多个不同的工作流、工具或子智能体之间做出选择时,应使用路由模式。此模式对于需要对传入请求进行分类或分派以处理不同类型任务的应用至关重要,例如客户支持机器人需要区分销售咨询、技术支持和账户管理问题,并将每种类型的请求路由到相应的处理模块。 +经验法则:当智能体必须根据用户输入或当前状态在多个不同的工作流、工具或子智能体之间做出选择时,应使用路由模式。此模式对于需要对传入请求进行分类或分派以处理不同类型任务的应用至关重要,例如客户支持机器人需要区分销售咨询、技术支持和账户管理问题,并将每种类型的请求路由到相应的处理模块。 **Visual summary** | 可视化总结 From 96a6607f6f16de747edba484ff282c421165bbbc Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:23:05 +0800 Subject: [PATCH 12/17] Update use case description for planning agents --- 10-Chapter-04-Reflection.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/10-Chapter-04-Reflection.md b/10-Chapter-04-Reflection.md index d47893a..8a4c425 100644 --- a/10-Chapter-04-Reflection.md +++ b/10-Chapter-04-Reflection.md @@ -140,7 +140,7 @@ Evaluating a proposed plan and identifying potential flaws or improvements. - **Reflection:** Generate a plan, simulate its execution or evaluate its feasibility against constraints, revise the plan based on the evaluation. - **Benefit:** Develops more effective and realistic plans. -- 用例:规划一系列行动以实现特定任务的智能体。 +- 用例:规划一系列行动以实现特定目标的智能体。 - 反思:制定计划,模拟执行或根据限制评估可行性,然后根据评估结果对计划进行改进与调整。 - 好处:制定更有效、更符合实际的计划。 From 90cf07cfdf6a28a2cbaa0b9d43fbff72ac7bb99c Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:24:54 +0800 Subject: [PATCH 13/17] Refine Chinese translations in Planning.md Updated Chinese translations for clarity and accuracy in the documentation. 
--- 12-Chapter-06-Planning.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/12-Chapter-06-Planning.md b/12-Chapter-06-Planning.md index 16e45ff..266d5ca 100644 --- a/12-Chapter-06-Planning.md +++ b/12-Chapter-06-Planning.md @@ -50,7 +50,7 @@ In essence, the Planning pattern allows an agent to move beyond simple, reactive The following section will demonstrate an implementation of the Planner pattern using the Crew AI framework. This pattern involves an agent that first formulates a multi-step plan to address a complex query and then executes that plan sequentially. -接下来我们将演示如何使用 CrewAI 框架实现规划模式。该模式中,智能体先制定多步骤的计划来解决复杂问题,然后按步骤依次执行该计划。 +接下来我们将演示如何使用 CrewAI 框架实现规划模式。该模式中,智能体先制定多步骤的计划来解决复杂请求,然后按步骤依次执行该计划。 ```python import os @@ -180,7 +180,7 @@ The efficiency of this approach stems from the automation of the iterative searc The OpenAI Deep Research API is a specialized tool designed to automate complex research tasks. It utilizes an advanced, agentic model that can independently reason, plan, and synthesize information from real-world sources. Unlike a simple Q&A model, it takes a high-level query and autonomously breaks it down into sub-questions, performs web searches using its built-in tools, and delivers a structured, citation-rich final report. The API provides direct programmatic access to this entire process, using at the time of writing models like o3-deep-research-2025-06-26 for high-quality synthesis and the faster o4-mini-deep-research-2025-06-26 for latency-sensitive application -OpenAI 深度研究接口(OpenAI Deep Research API)是一款专为自动化复杂研究任务而设计的工具。它利用高级智能体模型,能够独立推理、规划,并从真实世界来源整合信息。不同于简单的问答模型,它接收高层次的问题并自主拆解为若干子问题,借助内置工具进行网络搜索,最终给出结构化且带有引用的报告。通过该接口可以用编程的方式控制整个流程。撰写本书时可使用 o3-deep-research-2025-06-26 模型生成高质量的调研内容,而 o4-mini-deep-research-2025-06-26 模型则可用于对延迟更敏感的场景。 +OpenAI 深度研究接口(OpenAI Deep Research API)是一款专为自动化复杂研究任务而设计的工具。它利用高级智能体模型,能够独立推理、规划,并从真实世界来源整合信息。不同于简单的问答模型,它接收高层次的问题并自主拆解为若干子问题,借助内置工具进行网络搜索,最终给出结构化且带有引用的报告。通过该接口可以用编程的方式控制整个流程。撰写本书时可使用 o3-deep-research-2025-06-26 模型生成高质量的调研内容,而 o4-mini-deep-research-2025-06-26 模型则适用于对延迟更敏感的场景。 The Deep Research API is useful because it automates what would otherwise be hours of manual research, delivering professional-grade, data-driven reports suitable for informing business strategy, investment decisions, or policy recommendations. Its key benefits include: From 7a08dd5cf9a25b9bbf2564b25827604d485b92fb Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:27:58 +0800 Subject: [PATCH 14/17] Update greeting from 'Hello World' to 'Goodbye World' --- 14-Chapter-08-Memory-Management.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/14-Chapter-08-Memory-Management.md b/14-Chapter-08-Memory-Management.md index 2927c15..1984266 100644 --- a/14-Chapter-08-Memory-Management.md +++ b/14-Chapter-08-Memory-Management.md @@ -16,7 +16,7 @@ In agent systems, memory refers to an agent's ability to retain and utilize info 上下文窗口的容量有限,限制了智能体可直接访问的近期信息范围。高效的短期记忆管理需要在有限空间内选择性地保留最相关信息,可通过总结旧对话片段或强调关键细节等技术实现。 -具有「长上下文」窗口的模型虽然扩大了短期记忆容量,允许在单次交互中保存更多信息,但这种上下文仍然是短暂的,会话结束后即丢失,且每次处理成本高昂、效率较低。 +具有「长上下文」窗口的模型虽然扩大了短期记忆容量,允许在单次交互中保存更多信息,但这种上下文仍然是临时的,会话结束后即丢失,且每次处理成本高昂、效率较低。 因此,智能体需要不同类型的记忆来实现真正的持久化,从过往交互中回忆信息并构建持久的知识库。 @@ -296,8 +296,7 @@ def log_user_login(tool_context: ToolContext) -> dict: return { "status": "success", - "message": f"User login tracked. Total logins: -{login_count}." + "message": f"User login tracked. Total logins: {login_count}." 
} # --- 使用演示 --- @@ -497,8 +496,7 @@ conversation = LLMChain(llm=llm, prompt=prompt, memory=memory) # 4. 运行对话 response = conversation.predict(question="I want to book a flight.") print(response) -response = conversation.predict(question="My name is Sam, by the -way.") +response = conversation.predict(question="My name is Sam, by the way.") print(response) response = conversation.predict(question="What was my name again?") print(response) @@ -786,3 +784,4 @@ This chapter dove into the really important job of memory management for agent s 3. Vertex AI Agent Engine Memory Bank: Vertex AI 智能体引擎的 Memory Bank: + From d3e7a3c144b75e32f1ff724d399539b7cd4bdf21 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:29:46 +0800 Subject: [PATCH 15/17] Update Chinese translations in Human-in-the-Loop chapter --- 19-Chapter-13-Human-in-the-Loop.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/19-Chapter-13-Human-in-the-Loop.md b/19-Chapter-13-Human-in-the-Loop.md index 3a67d2c..27fb853 100644 --- a/19-Chapter-13-Human-in-the-Loop.md +++ b/19-Chapter-13-Human-in-the-Loop.md @@ -58,7 +58,7 @@ The Human-in-the-Loop pattern is vital across a wide range of industries and app - **Customer Support (Complex Queries)**: A chatbot might handle routine customer inquiries. If the user's problem is too complex, emotionally charged, or requires empathy that the AI cannot provide, the conversation is seamlessly handed over to a human support agent. - 客户支持(复杂咨询):聊天机器人可以处理常规的客户咨询。如果用户的问题过于复杂、情绪过于激动,或者需要 AI 无法提供的情感共鸣时,需要将对话转接给服务专家进行处理。 + 客户支持(复杂咨询):聊天机器人可以处理常规的客户咨询。如果用户的问题过于复杂、情绪过于激动,或者需要 AI 无法提供的情感共鸣时,需要将对话转接给人工坐席进行处理。 - **Data Labeling and Annotation**: AI models often require large datasets of labeled data for training. Humans are put in the loop to accurately label images, text, or audio, providing the ground truth that the AI learns from. This is a continuous process as models evolve. @@ -86,7 +86,7 @@ This pattern exemplifies a practical method for AI implementation. It harnesses - **Modern call center**: In this setup, a human manager establishes high-level policies for customer interactions. For instance, the manager might set rules such as "any call mentioning 'service outage' should be immediately routed to a technical support specialist," or "if a customer's tone of voice indicates high frustration, the system should offer to connect them directly to a human agent." The AI system then handles the initial customer interactions, listening to and interpreting their needs in real-time. It autonomously executes the manager's policies by instantly routing the calls or offering escalations without needing human intervention for each individual case. This allows the AI to manage the high volume of immediate actions according to the slower, strategic guidance provided by the human operator. 
- 现代化呼叫中心:在此场景中,经理为客户互动建立高级策略。例如,经理可能设定规则,如「任何提到服务中断的电话应立即转接给技术支持专员」,或者「如果客户的语气表现出高度沮丧,系统应主动提出将他们转接给人工坐席。」然后,AI 系统会处理最初的客户互动,实时倾听并解释他们的需求。它通过即时转接电话或提供上报选项来自主执行经理的策略,而无需为每个个案都进行人工干预。这使得 AI 能够根据人类操作员提供的较慢、战略性的指导来管理大量的即时行动。 + 现代化呼叫中心:在此场景中,经理为客户互动建立高级策略。例如,经理可能设定规则,如「任何提到服务中断的电话应立即转接给技术支持专员」,或者「如果客户的语气表现出高度沮丧,系统应主动提出将他们转接给人工坐席。」然后,AI 系统会处理最初的客户互动,实时倾听并解释他们的需求。它通过即时转接电话或提供升级服务选项来自主执行经理的策略,而无需为每个个案都进行人工干预。这使得 AI 能够根据人类操作员提供的较慢、战略性的指导来管理大量的即时行动。 ## Hands-On Code Example | 实战代码示例 From 72111fe1b7f5e51c54dedffb24071b8bd5b6bee4 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:33:18 +0800 Subject: [PATCH 16/17] Refine message attributes and interaction methods description Updated the description of message attributes and interaction mechanisms in the A2A framework. --- 21-Chapter-15-Inter-Agent-Communication.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/21-Chapter-15-Inter-Agent-Communication.md b/21-Chapter-15-Inter-Agent-Communication.md index cfd31e9..cbf6ca0 100644 --- a/21-Chapter-15-Inter-Agent-Communication.md +++ b/21-Chapter-15-Inter-Agent-Communication.md @@ -139,13 +139,13 @@ In the A2A framework, communication is structured around asynchronous tasks, whi This communication contains attributes, which are key-value metadata describing the message (like its priority or creation time), and one or more parts, which carry the actual content being delivered, such as plain text, files, or structured JSON data. The tangible outputs generated by an agent during a task are called artifacts. Like messages, artifacts are also composed of one or more parts and can be streamed incrementally as results become available. All communication within the A2A framework is conducted over HTTP(S) using the JSON-RPC 2.0 protocol for payloads. To maintain continuity across multiple interactions, a server-generated contextId is used to group related tasks and preserve context. -消息包含 attributes 和一个或多个 partattributes 是描述消息的键值元数据, 如其优先级或创建时间。part 承载实际交付的内容, 如纯文本、文件或结构化 JSON 数据。智能体在任务期间生成的具体输出被称为 artifacts。与消息类似, artifacts 也由一个或多个部分组成, 并且可以在结果可用时以增量方式流式传输。A2A 框架内的所有通信都通过 HTTP(S) 进行, 并使用 JSON-RPC 2.0 协议作为负载。为了在多次交互中保持连续性, 使用服务器生成的 contextId 来对相关任务进行分组并保留上下文。 +消息包含属性(attributes)和一个或多个部分(part)。属性(attributes是描述消息的键值元数据, 如其优先级或创建时间。部分(part承载实际交付的内容, 如纯文本、文件或结构化 JSON 数据。智能体在任务期间生成的具体输出被称为工件(artifacts)。与消息类似, 工件(artifacts)也由一个或多个部分组成, 并且可以在结果可用时以增量方式流式传输。A2A 框架内的所有通信都通过 HTTP(S) 进行, 并使用 JSON-RPC 2.0 协议作为负载。为了在多次交互中保持连续性, 使用服务器生成的上下文ID(contextId)来对相关任务进行分组并保留上下文。 ## Interaction Mechanisms | 交互机制 Request/Response (Polling) Server-Sent Events (SSE). A2A provides multiple interaction methods to suit a variety of AI application needs, each with a distinct mechanism: -A2A 提供了多种交互方法以适应各种 AI 应用需求, 每种方法都有其独特的机制: +A2A 提供了多种交互方法以适应各种 AI 应用需求, 每种方法都有其独特的机制: - Synchronous Request/Response: For quick, immediate operations. In this model, the client sends a request and actively waits for the server to process it and return a complete response in a single, synchronous exchange. - 同步请求/响应: 用于快速、即时的操作。在这种模型中, 客户端发送请求并主动等待服务器处理, 服务器在单个同步交换中返回完整响应。 @@ -462,4 +462,4 @@ The Inter-Agent Communication (A2A) protocol establishes a vital, open standard 4. Getting Started with Agent-to-Agent (A2A) Protocol: [https://codelabs.developers.google.com/intro-a2a-purchasing-concierge\#0](https://codelabs.developers.google.com/intro-a2a-purchasing-concierge#0) 5. Google Agent Discovery - [https://a2a-protocol.org/latest/](https://a2a-protocol.org/latest/) 6. 
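Since the "time budget" model described above differs from an ordinary request/response call, a short sketch may help. Every name in it (submit_research_task, get_task, the task states) is hypothetical pseudocode for the submit-then-poll pattern, not the real Deep Research client API.

```python
# Hypothetical sketch of the "time budget" pattern: submit a long-running
# research task, then poll until the agent returns a synthesized report.
# Function names and states are illustrative assumptions, not a real SDK.
import time

def submit_research_task(query: str, budget_minutes: int) -> str:
    """Pretend to enqueue a deep-research job and return a task id."""
    print(f"Submitted: {query!r} (budget: {budget_minutes} min)")
    return "task-123"

def get_task(task_id: str) -> dict:
    """Pretend to fetch task status; a real client would call a backend here."""
    return {"status": "completed", "report": "...detailed, citation-rich report..."}

def poll_until_done(task_id: str, interval_s: float = 30.0) -> str:
    while True:
        task = get_task(task_id)
        if task["status"] == "completed":
            return task["report"]
        time.sleep(interval_s)  # accept latency in exchange for synthesis quality

task_id = submit_research_task("Summarize recent findings on battery recycling", budget_minutes=5)
print(poll_until_done(task_id))
```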
Communication between different AI frameworks such as LangGraph, CrewAI, and Google ADK [https://www.trickle.so/blog/how-to-build-google-a2a-project](https://www.trickle.so/blog/how-to-build-google-a2a-project) -7. Designing Collaborative Multi-Agent Systems with the A2A Protocol [https://www.oreilly.com/radar/designing-collaborative-multi-agent-systems-with-the-a2a-protocol/](https://www.oreilly.com/radar/designing-collaborative-multi-agent-systems-with-the-a2a-protocol/) \ No newline at end of file +7. Designing Collaborative Multi-Agent Systems with the A2A Protocol [https://www.oreilly.com/radar/designing-collaborative-multi-agent-systems-with-the-a2a-protocol/](https://www.oreilly.com/radar/designing-collaborative-multi-agent-systems-with-the-a2a-protocol/) From ca9730ac5941aacf14078a5807c8c878f5858e70 Mon Sep 17 00:00:00 2001 From: Y <53069671+u3588064@users.noreply.github.com> Date: Mon, 17 Nov 2025 22:34:06 +0800 Subject: [PATCH 17/17] Fix formatting and enhance clarity in reasoning techniques Updated the text to correct formatting and improve clarity. --- 23-Chapter-17-Reasoning-Techniques.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/23-Chapter-17-Reasoning-Techniques.md b/23-Chapter-17-Reasoning-Techniques.md index dd3c476..5d7b5d7 100644 --- a/23-Chapter-17-Reasoning-Techniques.md +++ b/23-Chapter-17-Reasoning-Techniques.md @@ -412,7 +412,7 @@ Fig. 5: Google Deep Research for Information Gathering A fundamental shift introduced by these tools is the change in the search process itself. A standard search provides immediate links, leaving the work of synthesis to you. Deep Research operates on a different model. Here, you task an AI with a complex query and grant it a "time budget"—usually a few minutes. In return for this patience, you receive a detailed report. -这些工具带来的一个根本性转变是搜索过程本身的改变。标准搜索会立即提供链接,将综合整理的工作留给你。而深度研究则采用不同的模式。在这里,你给 AI 分配一个复杂的查询任务,并授予它一个「时间预算」——通常是几分钟。作为这种耐心的回报,你将收到一份详细的报告*。 +这些工具带来的一个根本性转变是搜索过程本身的改变。标准搜索会立即提供链接,将综合整理的工作留给你。而深度研究则采用不同的模式。在这里,你给 AI 分配一个复杂的查询任务,并授予它一个「时间预算」——通常是几分钟。作为这种耐心的回报,你将收到一份详细的报告。 During this time, the AI works on your behalf in an agentic way. It autonomously performs a series of sophisticated steps that would be incredibly time-consuming for a person: