pythoninchemistry
diff --git a/CH40208/_toc.yml b/CH40208/_toc.yml
Lines changed: 16 additions & 0 deletions
diff --git a/CH40208/course_contents/week_6.md b/CH40208/course_contents/week_6.md
Lines changed: 2 additions & 0 deletions
diff --git a/CH40208/model_fitting/exercises.ipynb b/CH40208/model_fitting/exercises.ipynb
Lines changed: 56 additions & 33 deletions
@@ -70,6 +70,8 @@ parts:
         - file: model_fitting/empirical_and_theoretical_models.md
         - file: model_fitting/model_complexity.md
         - file: model_fitting/finding_the_best_fit.md
+        - file: model_fitting/understanding_residuals
+        - file: model_fitting/fitting_with_minimize
         - file: model_fitting/exercises.ipynb
 ##  - file: working_with_data/ideal_gas_law
 #      - file: working_with_data/curve_fitting
@@ -155,3 +157,17 @@ parts:
           title: NumPy
         - file: worked_examples/week_3_synoptic_exercises
           title: Synoptic Exercises
+      - file: worked_examples/week_5/week_5_index
+        title: Week 5
+        sections:
+        - file: worked_examples/week_5/grid_search_method
+          title: Grid Search
+        - file: worked_examples/week_5/gradient_descent_method
+          title: Gradient Descent
+        - file: worked_examples/week_5/newton_raphson_method
+          title: Newton-Raphson
+        - file: worked_examples/week_5/Lennard_Jones_optimisation
+          title: Lennard-Jones optimisation
+        - file: worked_examples/week_5/scipy_optimize_minimize
+          title: scipy.optimize.minimize
+
@@ -7,4 +7,6 @@
     - [Empirical and Theoretical Models](../model_fitting/empirical_and_theoretical_models.md)
     - [Model Complexity](../model_fitting/model_complexity.md)
     - [Finding the Best Fit](../model_fitting/finding_the_best_fit.md)
+    - [Understanding Residuals and Quality of Fit](../model_fitting/understanding_residuals.ipynb)
+    - [Linear Model Fitting with scipy.optimize.minimize()](../model_fitting/fitting_with_minimize.ipynb)
     - [Exercises](../model_fitting/exercises.ipynb)
@@ -7,6 +7,8 @@
    "source": [
     "# Exercises\n",
     "\n",
+    "You have now worked through the complete workflow for fitting linear models to data. The exercises here give you practice applying this workflow to real chemical systems, where the context adds physical interpretation and requires data handling skills like file reading and coordinate transformations.\n",
+    "\n",
     "## 1. Linear fitting of the Van't Hoff equation\n",
     "\n",
     "The Van’t Hoff equation relates the change in the equilibrium constant, $K$, for a reaction, to a change in temperature, $T$, given that the enthalpy change for the reaction, $\\Delta H$, is constant over the temperature range of interest.\n",
@@ -37,7 +39,9 @@
    "id": "7d286fc9-c672-492d-af95-037d818e3968",
    "metadata": {},
    "source": [
-    "**(a)** The data file [exercise_1.dat](data/exercise_1.dat) gives the following experimentally measured equilibrium constants at different temperatures for the reaction\n",
+    "### (a) Read the data\n",
+    "\n",
+    "The data file [exercise_1.dat](data/exercise_1.dat) gives the following experimentally measured equilibrium constants at different temperatures for the reaction\n",
     "\n",
     "$$2\\mathrm{NO}_2 \\rightleftharpoons \\mathrm{N}_2\\mathrm{O}_4$$\n",
     "\n",
@@ -60,7 +64,9 @@
    "id": "3e6cf158-eb1a-4c24-9c07-bc046f360d54",
    "metadata": {},
    "source": [
-    "**(b)** Convert your raw data into $1/T$ (in inverse Kelvin) and $\\ln K$.\n",
+    "### (b) Transform and plot the data\n",
+    "\n",
+    "Convert your raw data into $1/T$ (in inverse Kelvin) and $\\ln K$.\n",
     "\n",
     "Plot these transformed data and confirm that they approximately show a linear relationship."
    ]
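A minimal sketch of the transformation in part (b), assuming the temperatures are already in Kelvin. The arrays below are synthetic stand-ins generated from the $m$ = 7400 and $c$ = -22.5 values quoted later in the exercise, not the contents of `exercise_1.dat`, and the variable names are illustrative only.

```python
import numpy as np

# Synthetic stand-in data (NOT the values in exercise_1.dat).
# With the real file you would read the columns instead, for example
# with np.loadtxt; the column layout of the file is not shown here.
m_ref, c_ref = 7400.0, -22.5
temperature = np.array([280.0, 300.0, 320.0, 340.0, 360.0])  # in K
equilibrium_constant = np.exp(m_ref / temperature + c_ref)

# Transform to the coordinates in which the Van't Hoff equation is linear.
inverse_T = 1.0 / temperature        # in 1/K
ln_K = np.log(equilibrium_constant)  # natural logarithm

# A quick numerical check of linearity: the Pearson correlation
# coefficient has magnitude close to 1 for a straight line.
r = np.corrcoef(inverse_T, ln_K)[0, 1]
print(f"correlation between 1/T and ln K: {r:.6f}")
```

For measured data the correlation will fall slightly below 1, so a scatter plot of `ln_K` against `inverse_T` remains the clearer check.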
@@ -70,51 +76,39 @@
    "id": "6bafffeb-00b4-4414-b7eb-a158acb10ecc",
    "metadata": {},
    "source": [
-    "**(c)** Write a function called `linear_model()` with arguments $x$, $m$, and $c$, that returns the corresponding $y$ value for the function\n",
-    "\n",
-    "$$y = mx + c$$"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "23bbafeb-c1ad-44c7-beaa-05836a84e64a",
-   "metadata": {},
-   "source": [
-    "**(d)** Replot your raw $1/T$ versus $\\ln K$ data and show that for $m$ = 7400 and $c$ = -22.5 your model approximately described your data."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "a7465c35",
-   "metadata": {},
-   "source": [
-    "**(e)** Write a function `error_function()` that takes as arguments a list of model parameters, a list of $x$ values, and a list of observed $y$ values. This function should call your `linear_model()` function to calculate the sum-of-squares error for a given pair of model parameters, $m$ and $c$.\n",
+    "### (c) Define model and error functions\n",
     "\n",
-    "$$\\chi^2 = \\sum_i\\left[y_i - f(x_i)\\right]^2$$"
+    "You have already written functions for a linear model and error function in the previous section. Write similar functions here, or adapt your previous code. Test your model by plotting the data alongside the model prediction for $m$ = 7400 and $c$ = -22.5 to verify your model approximately describes the data."
    ]
   },
   {
    "cell_type": "markdown",
    "id": "cae94c54",
    "metadata": {},
    "source": [
-    "**(f)** Use `scipy.optimize.minimize()` to find the &ldquo;best fit&rdquo; values of $m$ and $c$ for your data. Use the previous values of $m$ = 7400 and $c$ = -22.5 as your starting guess."
+    "### (d) Find best-fit parameters using `minimize()`\n",
+    "\n",
+    "Use `scipy.optimize.minimize()` to find the &ldquo;best fit&rdquo; values of $m$ and $c$ for your data. Use the previous values of $m$ = 7400 and $c$ = −22.5 as your starting guess."
    ]
   },
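One possible shape for parts (c) and (d), with `scipy.stats.linregress()` as a cross-check. This is a sketch under stated assumptions: the data are synthetic (generated from $m$ = 7400 and $c$ = -22.5 with a small amount of added noise), and the function signatures follow the description in the exercise rather than any reference solution.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import linregress


def linear_model(x, m, c):
    """Return y = m*x + c."""
    return m * x + c


def error_function(params, x, y_obs):
    """Sum of squared residuals between observed y and the linear model."""
    m, c = params
    return np.sum((y_obs - linear_model(x, m, c)) ** 2)


# Synthetic stand-in for the transformed data (NOT exercise_1.dat).
rng = np.random.default_rng(0)
inverse_T = 1.0 / np.linspace(280.0, 360.0, 9)
ln_K = linear_model(inverse_T, 7400.0, -22.5) + rng.normal(0.0, 0.01, inverse_T.size)

# Starting guess taken from the exercise text.
start = [7400.0, -22.5]
result = minimize(error_function, x0=start, args=(inverse_T, ln_K))
m_fit, c_fit = result.x

# Closed-form least-squares fit for comparison.
regression = linregress(inverse_T, ln_K)

print(f"minimize:   m = {m_fit:.1f}, c = {c_fit:.3f}")
print(f"linregress: m = {regression.slope:.1f}, c = {regression.intercept:.3f}")
```

Because $1/T$ is of order $10^{-3}$ while $m$ is of order $10^{4}$, this problem is poorly scaled, so `minimize()` may stop slightly short of the `linregress()` values. Checking `result.success` and `result.message` is worthwhile before trusting the numbers.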
   {
    "cell_type": "markdown",
    "id": "ac96378e",
    "metadata": {},
    "source": [
-    "**(g)** Replot your data, now including your &ldquo;line of best fit&rdquo;"
+    "### (e) Plot the fitted model\n",
+    "\n",
+    "Replot your data, now including your &ldquo;line of best fit&rdquo;."
    ]
   },
   {
    "cell_type": "markdown",
    "id": "a1fd5beb",
    "metadata": {},
    "source": [
-    "**(h)** One way to perform linear regression in Python without writing your own model function and error function is to use `scipy.stats.linregress()`:\n",
+    "### (f) Compare with `scipy.stats.linregress()`\n",
+    "\n",
+    "One way to perform linear regression in Python without writing your own model function and error function is to use `scipy.stats.linregress()`:\n",
     "\n",
     "```python\n",
     "from scipy.stats import linregress\n",
@@ -157,21 +151,33 @@
    "id": "e693f423",
    "metadata": {},
    "source": [
-    "**(a)** Read these data to numpy arrays, and plot time versus the concentration of H<sub>2</sub>O<sub>2</sub>.\n",
+    "### (a) Read and plot the data\n",
+    "\n",
+    "Read these data to numpy arrays, and plot time versus the concentration of H<sub>2</sub>O<sub>2</sub>.\n",
     "\n",
-    "**(b)** This reaction has a first-order rate equation, which will be our \"model\" for fitting the data:\n",
+    "### (b) Define the kinetic model function\n",
+    "\n",
+    "This reaction has a first-order rate equation, which will be our \"model\" for fitting the data:\n",
     "\n",
     "   $$[\\mathrm{A}](t) = [\\mathrm{A}]_0\\exp(-kt)$$ \n",
     "\n",
     "   Write a function `first_order()` to calculate this $[\\mathrm{A}](t)$ as a function of $t$, $[\\mathrm{A}]_0$, and $k$.\n",
     "\n",
-    "**(c)** Write a function `error_function()` that calculates the least-squares error for your first-order kinetic model compared to the experimental data.\n",
+    "### (c) Write an error function\n",
+    "\n",
+    "Write a function `error_function()` that calculates the least-squares error for your first-order kinetic model compared to the experimental data.\n",
+    "\n",
+    "### (d) Find best-fit parameters using `minimize()`\n",
+    "\n",
+    "Use `scipy.optimize.minimize()` to find the best-fit values of $[\\mathrm{A}]_0$ and $k$. Use starting guesses of $k$ = 0.1 and $[\\mathrm{A}]_0$ = 7.\n",
     "\n",
-    "**(d)** Use `scipy.optimize.minimize()` to find the best-fit values of $[\\mathrm{A}]_0$, and $k$. Use starting guesses of $k$ = 0.1 and $A$ = 7.\n",
+    "### (e) Plot the fitted model\n",
     "\n",
-    "**(e)** Replot the raw data, plus the curve for your best-fit parameters.\n",
+    "Replot the raw data, plus the curve for your best-fit parameters.\n",
     "\n",
-    "**(f)** You can also perform non-linear least-squares fitting, without having to define an error function, using `scipy.optimize.curve_fit()`.\n",
+    "### (f) Use `scipy.optimize.curve_fit()`\n",
+    "\n",
+    "You can also perform non-linear least-squares fitting, without having to define an error function, using `scipy.optimize.curve_fit()`.\n",
     "\n",
     "```python\n",
     "from scipy.optimize import curve_fit\n",
@@ -184,7 +190,24 @@
     "\n",
     "`curve_fit()` returns two numpy arrays. The first array contains the optimised model parameters. The second is a 2D array that describes the *covariance* of these parameters. The diagonal elements of this array are the estimated variances of each model parameter (their square roots give the standard deviations), and the off-diagonal elements give information about the degree of correlation between the model parameters.\n",
     "\n",
-    "Use `curve_fit()` to confirm your best fit model parameters for your first-order kinetic model. You will need to pass in as arguments your model function, the observed $x$ data (time), and the observed $y$ data (H<sub>2</sub>O<sub>2</sub> concentration)."
+    "Use `curve_fit()` to confirm your best fit model parameters for your first-order kinetic model. You will need to pass in as arguments your model function, the observed $x$ data (time), and the observed $y$ data (H<sub>2</sub>O<sub>2</sub> concentration).\n",
+    "\n",
+    "### (g) Optional: Using linearisation to find initial guesses\n",
+    "\n",
+    "Non-linear fitting can sometimes be sensitive to initial parameter guesses, particularly for exponential models. One useful strategy when you encounter convergence difficulties is to linearise the model first.\n",
+    "\n",
+    "For a first-order kinetic model, taking the natural logarithm of both sides gives:\n",
+    "\n",
+    "$$\\ln[\\mathrm{A}](t) = \\ln[\\mathrm{A}]_0 - kt$$\n",
+    "\n",
+    "This is linear in $t$ with slope $-k$ and intercept $\\ln[\\mathrm{A}]_0$.\n",
+    "\n",
+    "1. Transform your data: calculate $\\ln[\\mathrm{H}_2\\mathrm{O}_2]$\n",
+    "2. Use `scipy.stats.linregress()` to fit this linearised form\n",
+    "3. Extract approximate values for $k$ and $[\\mathrm{A}]_0$ from the fit\n",
+    "4. Use these as initial guesses in your non-linear fit from part (d)\n",
+    "\n",
+    "Compare the fitted parameters from the linearised fit with those from the non-linear fit. You should find they are slightly different. The non-linear fit (fitting the exponential model directly) is generally preferred, because taking logarithms changes how strongly each data point's error is weighted in the fit. The linearised fit has a closed-form solution and needs no starting guess, however, so it is useful primarily as a way to obtain good initial guesses for the non-linear fit."
    ]
   },
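For parts (b) to (d) of the kinetics exercise, the model, error function and `minimize()` call might look like the sketch below. The time and concentration arrays are synthetic (generated with arbitrary "true" values of $[\mathrm{A}]_0$ = 6.0 and $k$ = 0.08 plus noise), since the real data file is not shown here. The parameter order `[a0, k]` is an assumption of this sketch.

```python
import numpy as np
from scipy.optimize import minimize


def first_order(t, a0, k):
    """First-order decay: [A](t) = [A]_0 * exp(-k t)."""
    return a0 * np.exp(-k * t)


def error_function(params, t, conc_obs):
    """Sum of squared residuals for the first-order model."""
    a0, k = params
    return np.sum((conc_obs - first_order(t, a0, k)) ** 2)


# Synthetic stand-in data (NOT the measured H2O2 concentrations).
rng = np.random.default_rng(1)
time = np.linspace(0.0, 30.0, 16)
conc = first_order(time, 6.0, 0.08) + rng.normal(0.0, 0.05, time.size)

# Starting guesses from the exercise: [A]_0 = 7 and k = 0.1.
result = minimize(error_function, x0=[7.0, 0.1], args=(time, conc))
a0_fit, k_fit = result.x

print(f"[A]_0 = {a0_fit:.3f}, k = {k_fit:.4f}, success = {result.success}")
```

The order of the values in `x0` must match the order in which `error_function()` unpacks `params`. Swapping them is an easy mistake to make when the exercise lists $k$ first.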
   {
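Parts (f) and (g) can be combined: fit the linearised form with `linregress()` to get initial guesses, then pass them to `curve_fit()` through its `p0` argument. The sketch below again uses synthetic data, here with multiplicative noise so that every concentration stays positive and its logarithm is defined. That noise model is an assumption of the example, not a property of the real data.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress


def first_order(t, a0, k):
    """First-order decay: [A](t) = [A]_0 * exp(-k t)."""
    return a0 * np.exp(-k * t)


# Synthetic stand-in data (NOT the measured H2O2 concentrations).
rng = np.random.default_rng(2)
time = np.linspace(0.0, 30.0, 16)
conc = first_order(time, 6.0, 0.08) * np.exp(rng.normal(0.0, 0.02, time.size))

# Linearised fit: ln[A] = ln[A]_0 - k t, so slope = -k, intercept = ln[A]_0.
regression = linregress(time, np.log(conc))
k_guess = -regression.slope
a0_guess = np.exp(regression.intercept)

# Non-linear fit of the exponential model, seeded with the linearised values.
popt, pcov = curve_fit(first_order, time, conc, p0=[a0_guess, k_guess])

# Square roots of the diagonal of the covariance matrix give the
# estimated standard deviations of the fitted parameters.
perr = np.sqrt(np.diag(pcov))

print(f"linearised: [A]_0 = {a0_guess:.3f}, k = {k_guess:.4f}")
print(f"curve_fit:  [A]_0 = {popt[0]:.3f} +/- {perr[0]:.3f}, "
      f"k = {popt[1]:.4f} +/- {perr[1]:.4f}")
```

The two sets of parameters should be close but not identical, which is the comparison part (g) asks for.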
@@ -212,7 +235,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.1"
+   "version": "3.12.9"
   }
  },
  "nbformat": 4,