
09.4 Google CoLab

Chris Swain edited this page Apr 27, 2021 · 1 revision

9.4 Google CoLab:

Because running docking experiments on the UCL cluster can be restricted by the VPN connection, Google CoLab can be used as an alternative platform for running the SMINA docking scripts. Like the UCL cluster, CoLab lets you write and execute Python notebooks in a web browser; it requires no configuration and offers free access to GPUs. The procedure is very similar to running Jupyter Notebooks on the cluster.

First, log on to the CoLab website with your Google account (https://colab.research.google.com/notebooks/intro.ipynb); the introductory notebook will open in your browser. To upload the notebook (DockingCoLab.ipynb), go to File > Upload notebook.

After that, you need to go to the orange folder (bottom left) and upload the rest of the files by clicking on the icon highlighted in the red box.
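If you prefer to upload from a code cell rather than through the file browser, CoLab's `google.colab.files` helper can do the same job. The sketch below is not part of DockingCoLab.ipynb; the function name `upload_input_files` is made up for illustration, and outside CoLab it simply returns an empty list.

```python
# Optional alternative to the file-browser upload (illustrative only).
try:
    from google.colab import files  # only importable inside a CoLab runtime
except ImportError:
    files = None


def upload_input_files():
    """Open CoLab's upload dialog and return the names of the uploaded files.

    Outside CoLab there is no upload widget, so an empty list is returned.
    """
    if files is None:
        return []
    uploaded = files.upload()  # dict: file name -> file contents (bytes)
    return sorted(uploaded)


uploaded_names = upload_input_files()
print("Uploaded:", uploaded_names)
```

Files uploaded this way are placed in the notebook's current working directory, the same location the file browser shows by default.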

Then make sure that all the variables are set, that every input file you are going to use is correctly named, and that every corresponding output is given a file name.

Here, you need to type in the name of your ligand file (e.g. “asinexSelectionexport.sdf”) and the name of its conformation file (e.g. “asinexSelectionForDocking.sdf”) in the fields highlighted by the red box.

You can decide how many conformations you want to generate. The starting number is 5, but you can always increase it. Bear in mind that the number you enter is a maximum and will not always be reached: for example, if you set n = 100, only 23 conformations may be generated. The number actually generated depends on the structural features of your ligand.
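To see how many conformations were actually written, you can count the records in the conformation file. The sketch below relies only on the SDF convention that every record ends with a line containing `$$$$`; the helper name `count_sdf_records` is hypothetical and is not taken from the notebook.

```python
from pathlib import Path


def count_sdf_records(sdf_path):
    """Count molecule records in an SDF file.

    Each record in an SDF file ends with a line containing only "$$$$".
    """
    count = 0
    with open(sdf_path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip() == "$$$$":
                count += 1
    return count


# Example file name from this page; replace it with your own conformation file.
conformer_file = Path("asinexSelectionForDocking.sdf")
if conformer_file.is_file():
    total = count_sdf_records(conformer_file)
    print(conformer_file.name, "contains", total, "records")
```

If the file holds conformations for several ligands, the count is the total across all of them, not the number per ligand.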

To use the structural files of your protein (“protein_minus_ligand.pdb”) and its originally bound ligand (“373ligand_only.pdb”), you need to enter their file names exactly as they appear in the file list (highlighted in the red box) so that the programme can find and use them. For rigid docking, the output will have a default name of “All_Docked.sdf.gz”.
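For orientation, the file names above map onto a SMINA command roughly as sketched below. This is an assumption about how such a call could look, not the exact command inside DockingCoLab.ipynb: the function `build_smina_command` is hypothetical, and the `exhaustiveness` and `num_modes` values are only example settings. The flags themselves (`-r`, `-l`, `--autobox_ligand`, `-o`, `--exhaustiveness`, `--num_modes`, `--seed`) are standard SMINA options.

```python
import shlex


def build_smina_command(receptor, ligands, reference_ligand, output,
                        exhaustiveness=8, num_modes=9, seed=None):
    """Assemble a SMINA command line as a list of arguments (nothing is run)."""
    cmd = [
        "smina",
        "-r", receptor,                        # protein without its ligand
        "-l", ligands,                         # ligand conformations to dock
        "--autobox_ligand", reference_ligand,  # search box built around this ligand
        "-o", output,                          # docked poses are written here
        "--exhaustiveness", str(exhaustiveness),
        "--num_modes", str(num_modes),
    ]
    if seed is not None:
        cmd += ["--seed", str(seed)]           # fixed seed for repeatable runs
    return cmd


cmd = build_smina_command(
    receptor="protein_minus_ligand.pdb",
    ligands="asinexSelectionForDocking.sdf",
    reference_ligand="373ligand_only.pdb",
    output="All_Docked.sdf.gz",
)
print(shlex.join(cmd))
```

Building the command as a list makes a misspelt file name easy to spot before anything is run; the list could then be passed to `subprocess.run(cmd, check=True)` in a runtime where SMINA is installed.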

The same rules for entering file names (highlighted by the red box) also apply to the Redocking section of the notebook. The target protein and its ligand should be used in this section, and the redocking scores with the predicted binding information will be listed directly below the cell.
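Because a misspelt file name is the most likely cause of a failed run, it can help to check that every input file is present before executing the docking cells. The following is a minimal sketch using the example names from this page; `find_missing_files` is a hypothetical helper, not something defined in the notebook.

```python
from pathlib import Path


def find_missing_files(file_names, folder="."):
    """Return the names in file_names that do not exist inside folder."""
    base = Path(folder)
    return [name for name in file_names if not (base / name).is_file()]


# Example input names from this page; replace them with your own.
required = [
    "asinexSelectionexport.sdf",
    "protein_minus_ligand.pdb",
    "373ligand_only.pdb",
]
missing = find_missing_files(required)
if missing:
    print("Not found, check the spelling or upload again:", missing)
else:
    print("All input files found.")
```

File names are case-sensitive in CoLab, so “Protein_minus_ligand.pdb” would be reported as missing if the uploaded file is “protein_minus_ligand.pdb”.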

By default, all results will be combined and saved as “Alldata.sdf.gz”. This file will appear in the left column, ready to download, once you have run through all the cells in the notebook.
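Before downloading, you may want a quick look at what the combined file contains. The sketch below reads a gzipped SDF with the standard library and collects the data fields of each record; `read_sdf_data_fields` is a hypothetical helper. The field names depend on how the notebook writes its results (SMINA itself usually records its score under a tag such as `minimizedAffinity`), so check the names printed for your own file.

```python
import gzip
from pathlib import Path


def read_sdf_data_fields(gz_path):
    """Return one dict of SD data fields per record in a gzipped SDF file."""
    records, fields, current_tag = [], {}, None
    with gzip.open(gz_path, "rt", encoding="utf-8", errors="replace") as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if line.strip() == "$$$$":           # end of one molecule record
                records.append(fields)
                fields, current_tag = {}, None
            elif line.startswith(">") and -1 < line.find("<") < line.rfind(">"):
                # Data header line, e.g. "> <minimizedAffinity>"
                current_tag = line[line.find("<") + 1:line.rfind(">")]
                fields[current_tag] = ""
            elif current_tag is not None:
                if line.strip() == "":
                    current_tag = None           # a blank line ends the value
                elif fields[current_tag]:
                    fields[current_tag] += "\n" + line
                else:
                    fields[current_tag] = line
    return records


results_file = Path("Alldata.sdf.gz")  # default output name from this page
if results_file.is_file():
    records = read_sdf_data_fields(results_file)
    print(len(records), "records")
    if records:
        print("Data fields in the first record:", sorted(records[0]))
```

All values are returned as text, so convert a score with `float(...)` before sorting or comparing.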

Additional note: you can run the script cell by cell by clicking on the arrow on each cell (in the rectangular red box), or you can run everything at once via Runtime > Run all (in the square red box).

For more insights on how to use Google CoLab, check out the RSC CICAG virtual session by Jan Jensen: https://www.youtube.com/watch?v=KEIpJ50Jc0w
