stanfordnlp · okhat · Aug 10, 2025 · Jul 21, 2025
diff --git a/docs/docs/api/optimizers/MIPROv2.md b/docs/docs/api/optimizers/MIPROv2.md
@@ -64,6 +64,6 @@ These steps are broken down in more detail below:
 
 2) **Propose Instruction Candidates**. The instruction proposer includes (1) a generated summary of properties of the training dataset, (2) a generated summary of your LM program's code and the specific predictor that an instruction is being generated for, (3) the previously bootstrapped few-shot examples to show reference inputs / outputs for a given predictor and (4) a randomly sampled tip for generation (i.e. "be creative", "be concise", etc.) to help explore the feature space of potential instructions.  This context is provided to a `prompt_model` which writes high quality instruction candidates.
 
-3) **Find an Optimized Combination of Few-Shot Examples & Instructions**. Finally, we use Bayesian Optimization to choose which combinations of instructions and demonstrations work best for each predictor in our program. This works by running a series of `num_trials` trials, where a new set of prompts are evaluated over our validation set at each trial. The new set of prompts are only evaluated on a minibatch of size `minibatch_size` at each trial (when `minibatch`=`True`). The best averaging set of prompts is then evalauted on the full validation set every `minibatch_full_eval_steps`. At the end of the optimization process, the LM program with the set of prompts that performed best on the full validation set is returned.
+3) **Find an Optimized Combination of Few-Shot Examples & Instructions**. Finally, we use Bayesian Optimization to choose which combinations of instructions and demonstrations work best for each predictor in our program. This works by running a series of `num_trials` trials, where a new set of prompts are evaluated over our validation set at each trial. The new set of prompts are only evaluated on a minibatch of size `minibatch_size` at each trial (when `minibatch`=`True`). The best averaging set of prompts is then evaluated on the full validation set every `minibatch_full_eval_steps`. At the end of the optimization process, the LM program with the set of prompts that performed best on the full validation set is returned.
 
 For those interested in more details, more information on `MIPROv2` along with a study on `MIPROv2` compared with other DSPy optimizers can be found in [this paper](https://arxiv.org/abs/2406.11695).
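
Reviewer note on the step 3 paragraph touched above: the minibatch schedule it describes can be sketched in plain Python. This is a rough illustration and not MIPROv2's implementation. Random candidate selection stands in for the Bayesian optimizer, and `minibatch_search`, `evaluate`, `score_fn`, and `seed` are hypothetical names introduced here. Only `num_trials`, `minibatch_size`, and `minibatch_full_eval_steps` come from the docs text.

```python
import random


def mean(values):
    return sum(values) / len(values)


def evaluate(candidate, examples, score_fn):
    """Average score of one prompt candidate over the given examples."""
    return mean([score_fn(candidate, ex) for ex in examples])


def minibatch_search(candidates, valset, score_fn, num_trials=20,
                     minibatch_size=4, minibatch_full_eval_steps=5, seed=0):
    """Pick the candidate with the best full-validation score (assumes num_trials >= 1)."""
    rng = random.Random(seed)
    minibatch_scores = {}  # candidate index -> list of minibatch scores
    full_scores = {}       # candidate index -> score on the full validation set

    def full_eval_best_average():
        # Fully evaluate the best-averaging candidate that has no full score yet.
        pending = [i for i in minibatch_scores if i not in full_scores]
        if pending:
            best = max(pending, key=lambda i: mean(minibatch_scores[i]))
            full_scores[best] = evaluate(candidates[best], valset, score_fn)

    for trial in range(1, num_trials + 1):
        # Assumption: a random pick replaces the Bayesian optimizer's proposal.
        idx = rng.randrange(len(candidates))
        batch = rng.sample(valset, min(minibatch_size, len(valset)))
        score = evaluate(candidates[idx], batch, score_fn)
        minibatch_scores.setdefault(idx, []).append(score)
        if trial % minibatch_full_eval_steps == 0:
            full_eval_best_average()

    if not full_scores:
        # Too few trials to reach a scheduled full evaluation; run one now.
        full_eval_best_average()
    winner = max(full_scores, key=full_scores.get)
    return candidates[winner], full_scores[winner]
```

The returned candidate is chosen by full-validation score only, matching the paragraph's last sentence. Minibatch averages decide which candidate gets a full evaluation, but they never decide the winner.
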
diff --git a/docs/docs/learn/evaluation/overview.md b/docs/docs/learn/evaluation/overview.md
@@ -4,7 +4,7 @@ sidebar_position: 1
 
 # Evaluation in DSPy
 
-Once you have an initial system, it's time to **collect an initial development set** so you can refine it more systematically. Even 20 input examples of your task can be useful, though 200 goes a long way. Depending on your _metric_, you either just need inputs and no labels at all, or you need inputs and the _final_ outputs of your system. (You almost never need labels for the intermediate steps in your program in DSPy.) You can probably find datasets that are adjacent to your task on, say, HuggingFace datasets or in a naturally occuring source like StackExchange. If there's data whose licenses are permissive enough, we suggest you use them. Otherwise, you can label a few examples by hand or start deploying a demo of your system and collect initial data that way.
+Once you have an initial system, it's time to **collect an initial development set** so you can refine it more systematically. Even 20 input examples of your task can be useful, though 200 goes a long way. Depending on your _metric_, you either just need inputs and no labels at all, or you need inputs and the _final_ outputs of your system. (You almost never need labels for the intermediate steps in your program in DSPy.) You can probably find datasets that are adjacent to your task on, say, HuggingFace datasets or in a naturally occurring source like StackExchange. If there's data whose licenses are permissive enough, we suggest you use them. Otherwise, you can label a few examples by hand or start deploying a demo of your system and collect initial data that way.
 
 Next, you should **define your DSPy metric**. What makes outputs from your system good or bad? Invest in defining metrics and improving them incrementally over time; it's hard to consistently improve what you aren't able to define. A metric is a function that takes examples from your data and takes the output of your system, and returns a score. For simple tasks, this could be just "accuracy", e.g. for simple classification or short-form QA tasks. For most applications, your system will produce long-form outputs, so your metric will be a smaller DSPy program that checks multiple properties of the output. Getting this right on the first try is unlikely: start with something simple and iterate.
 
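Reviewer note on the metric paragraph in the context above: a minimal sketch of a simple "accuracy" style metric might look like the following. The `(example, pred, trace=None)` argument order follows the usual DSPy metric convention. The `answer` and `question` fields, the `average_metric` helper, and the toy program are assumptions made up for illustration, and `SimpleNamespace` stands in for real DSPy example and prediction objects so the snippet runs without DSPy installed.

```python
from types import SimpleNamespace


def exact_match_metric(example, pred, trace=None):
    """Score 1.0 when the predicted answer equals the gold answer.

    Comparison ignores case and surrounding whitespace.
    `answer` is a hypothetical field name.
    """
    gold = example.answer.strip().lower()
    guess = pred.answer.strip().lower()
    return float(gold == guess)


def average_metric(devset, program, metric):
    """Run `program` on each example and average the metric scores."""
    scores = [metric(ex, program(ex.question)) for ex in devset]
    return sum(scores) / len(scores)


# Toy stand-ins for a development set and a program (illustrative only).
devset = [
    SimpleNamespace(question="Capital of France?", answer=" paris "),
    SimpleNamespace(question="Capital of Italy?", answer="Rome"),
]


def toy_program(question):
    # Always answers "Paris", so it is right on one of the two examples.
    return SimpleNamespace(answer="Paris")


print(average_metric(devset, toy_program, exact_match_metric))
```

For long-form outputs, the same function shape applies, but the body would check several properties of the output instead of a single string match.
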

diff --git a/docs/docs/tutorials/custom_module/index.ipynb b/docs/docs/tutorials/custom_module/index.ipynb
@@ -70,7 +70,7 @@
         "id": "DziTWwT8_TrY"
       },
       "source": [
-        "Let's illustrate this with a practical code example. We will build a simple Retrieval-Augmented Generation (RAG) application with mulitple stages:\n",
+        "Let's illustrate this with a practical code example. We will build a simple Retrieval-Augmented Generation (RAG) application with multiple stages:\n",
         "\n",
         "1.  **Query Generation:** Generate a suitable query based on the user's question to retrieve relevant context.\n",
         "2.  **Context Retrieval:** Fetch context using the generated query.\n",

diff --git a/docs/docs/tutorials/customer_service_agent/index.ipynb b/docs/docs/tutorials/customer_service_agent/index.ipynb
@@ -115,7 +115,7 @@
       "source": [
         "### Create Dummy Data\n",
         "\n",
-        "Let's also create some dummy data so that the airline agent can do the work. We need to create a few flights and a few users, and initilize empty dictionaries for the itineraries and custom support tickets."
+        "Let's also create some dummy data so that the airline agent can do the work. We need to create a few flights and a few users, and initialize empty dictionaries for the itineraries and custom support tickets."
       ]
     },
     {
@@ -295,11 +295,11 @@
       "source": [
         "import dspy\n",
         "\n",
-        "class DSPyAirlineCustomerSerice(dspy.Signature):\n",
+        "class DSPyAirlineCustomerService(dspy.Signature):\n",
         "    \"\"\"You are an airline customer service agent that helps user book and manage flights.\n",
         "\n",
         "    You are given a list of tools to handle user request, and you should decide the right tool to use in order to\n",
-        "    fullfil users' request.\"\"\"\n",
+        "    fulfill users' request.\"\"\"\n",
         "\n",
         "    user_request: str = dspy.InputField()\n",
         "    process_result: str = dspy.OutputField(\n",
@@ -319,7 +319,7 @@
       "outputs": [],
       "source": [
         "agent = dspy.ReAct(\n",
-        "    DSPyAirlineCustomerSerice,\n",
+        "    DSPyAirlineCustomerService,\n",
         "    tools = [\n",
         "        fetch_flight_info,\n",
         "        fetch_itinerary,\n",
@@ -497,7 +497,7 @@
             "        You are an airline customer service agent that helps user book and manage flights. \n",
             "        \n",
             "        You are given a list of tools to handle user request, and you should decide the right tool to use in order to\n",
-            "        fullfil users' request.\n",
+            "        fulfill users' request.\n",
             "        \n",
             "        You are an Agent. In each episode, you will be given the fields `user_request` as input. And you can see your past trajectory so far.\n",
             "        Your goal is to use one or more of the supplied tools to collect any necessary information for producing `process_result`.\n",
@@ -579,7 +579,7 @@
             "        You are an airline customer service agent that helps user book and manage flights. \n",
             "        \n",
             "        You are given a list of tools to handle user request, and you should decide the right tool to use in order to\n",
-            "        fullfil users' request.\n",
+            "        fulfill users' request.\n",
             "        \n",
             "        You are an Agent. In each episode, you will be given the fields `user_request` as input. And you can see your past trajectory so far.\n",
             "        Your goal is to use one or more of the supplied tools to collect any necessary information for producing `process_result`.\n",
@@ -672,7 +672,7 @@
             "        You are an airline customer service agent that helps user book and manage flights. \n",
             "        \n",
             "        You are given a list of tools to handle user request, and you should decide the right tool to use in order to\n",
-            "        fullfil users' request.\n",
+            "        fulfill users' request.\n",
             "        \n",
             "        You are an Agent. In each episode, you will be given the fields `user_request` as input. And you can see your past trajectory so far.\n",
             "        Your goal is to use one or more of the supplied tools to collect any necessary information for producing `process_result`.\n",
@@ -777,7 +777,7 @@
             "        You are an airline customer service agent that helps user book and manage flights. \n",
             "        \n",
             "        You are given a list of tools to handle user request, and you should decide the right tool to use in order to\n",
-            "        fullfil users' request.\n",
+            "        fulfill users' request.\n",
             "        \n",
             "        You are an Agent. In each episode, you will be given the fields `user_request` as input. And you can see your past trajectory so far.\n",
             "        Your goal is to use one or more of the supplied tools to collect any necessary information for producing `process_result`.\n",
@@ -894,7 +894,7 @@
             "        You are an airline customer service agent that helps user book and manage flights. \n",
             "        \n",
             "        You are given a list of tools to handle user request, and you should decide the right tool to use in order to\n",
-            "        fullfil users' request.\n",
+            "        fulfill users' request.\n",
             "        \n",
             "        You are an Agent. In each episode, you will be given the fields `user_request` as input. And you can see your past trajectory so far.\n",
             "        Your goal is to use one or more of the supplied tools to collect any necessary information for producing `process_result`.\n",
@@ -1019,7 +1019,7 @@
             "        You are an airline customer service agent that helps user book and manage flights. \n",
             "        \n",
             "        You are given a list of tools to handle user request, and you should decide the right tool to use in order to\n",
-            "        fullfil users' request.\n",
+            "        fulfill users' request.\n",
             "\n",
             "\n",
             "\u001b[31mUser message:\u001b[0m\n",