This workshop is a crash course for beginner Python coders in STEM fields. As sequencing methods advance and their costs fall, scientists increasingly need to know how to code. Beyond developing and running their own code, scientists often need to read another scientist's code and work out what it does for the benefit of their own research.
Google Colab is a convenient way to test and store code in notebook form. In this workshop, we will use Jupyter Notebooks in Google Colab to keep notes and write the code for the exercises. All you need is internet access and your own Gmail account. Get your feet wet by discovering that the Python interpreter is just a fancy calculator where you can save and edit your work. You will also learn the simple data types (integers, floats, and strings), define custom functions, and resolve coding errors. This workshop ends with the development of your first block of code!
Notebook download:
Student_PART_1_Python3_Workshop-3.ipynb
The student will write a block of code that takes in a DNA sequence of interest obtained from the UniProt database and a cross-referenced database. The code defines functions that return the following values: total guanines, total cytosines, total adenines, and total thymines, and the (A+T)/(G+C) ratio of the DNA sequence.
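As a rough idea of what this exercise could look like, here is a minimal sketch. The function names `count_base` and `at_gc_ratio` and the example sequence are made up for illustration and are not taken from the workshop notebook. It assumes the fourth DNA base is thymine (T), which is what the A+T/G+C ratio implies.

```python
def count_base(sequence, base):
    """Return how many times `base` occurs in `sequence` (case-insensitive)."""
    return sequence.upper().count(base.upper())


def at_gc_ratio(sequence):
    """Return (A + T) / (G + C) for a DNA sequence."""
    a = count_base(sequence, "A")
    t = count_base(sequence, "T")
    g = count_base(sequence, "G")
    c = count_base(sequence, "C")
    if g + c == 0:
        # Avoid dividing by zero when the sequence has no G or C.
        raise ValueError("sequence contains no G or C, so the ratio is undefined")
    return (a + t) / (g + c)


# Made-up example sequence (an assumption, not from any database).
dna = "ATGCGCTTAA"

# Count each base once and keep the totals in a dictionary.
totals = {base: count_base(dna, base) for base in "GCAT"}

print(totals)            # {'G': 2, 'C': 2, 'A': 3, 'T': 3}
print(at_gc_ratio(dna))  # 1.5
```

Splitting the counting into its own small function keeps the ratio calculation short and makes each piece easy to test separately.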
Part 2 - The use of booleans and conditional statements in for and while loops to automate repetitive tasks.
It is hard to avoid lists and dictionaries when analyzing data. In this lesson, the student will create lists and dictionaries, index into lists, and look up dictionary values by key. Using booleans and conditionals in for and while loops will become second nature to the student when analyzing data stored in lists and dictionaries.
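The ideas in this lesson can be sketched in a few lines. The sequence names, lengths, and the cutoff of 100 below are invented for illustration only.

```python
# Hypothetical data: names and lengths are made up for this example.
names = ["seq1", "seq2", "seq3"]                  # a list
lengths = {"seq1": 120, "seq2": 45, "seq3": 300}  # a dictionary

print(names[0])         # indexing a list -> seq1
print(lengths["seq2"])  # looking up a key -> 45

# for loop + conditional: collect every sequence of at least 100 bases.
long_names = []
for name in names:
    is_long = lengths[name] >= 100  # a boolean (True or False)
    if is_long:
        long_names.append(name)

# while loop: walk the list until the first sequence shorter than 100 bases.
index = 0
while index < len(names) and lengths[names[index]] >= 100:
    index += 1
first_short = names[index] if index < len(names) else None

print(long_names)   # ['seq1', 'seq3']
print(first_short)  # seq2
```

The for loop visits every item, while the while loop stops as soon as its condition becomes false, which is why it suits "search until found" tasks.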
Notebook download:
Student_PART_2_NEW_Python3_Workshop_v2.ipynb
The student will re-create familiar Python functions from scratch. The student will also use a loop to count the total differences between two DNA sequences. Finally, given a list of sequences, the student will return properties of each sequence (i.e., its length, the transcript of the protein-coding gene, and the amino acid sequence).
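One possible shape for these exercises is sketched below. All function names (`my_len`, `count_differences`, `transcribe`, `translate`, `describe_sequences`) and the example sequences are hypothetical. The sketch assumes each DNA sequence is the coding strand read in frame from its first base, that the two compared sequences have equal length, and that the standard genetic code applies.

```python
def my_len(items):
    """Re-create len() from scratch with a loop."""
    count = 0
    for _ in items:
        count += 1
    return count


def count_differences(seq_a, seq_b):
    """Count positions where two equal-length sequences differ."""
    if my_len(seq_a) != my_len(seq_b):
        raise ValueError("sequences must have the same length")
    differences = 0
    for position in range(my_len(seq_a)):
        if seq_a[position] != seq_b[position]:
            differences += 1
    return differences


def transcribe(dna):
    """Return the RNA transcript of a coding-strand DNA sequence (T -> U)."""
    return dna.upper().replace("T", "U")


# Standard genetic code, built from the usual U, C, A, G ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(rna):
    """Translate RNA codon by codon, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[start:start + 3]]
        if amino_acid == "*":  # "*" marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def describe_sequences(sequences):
    """Return length, transcript, and protein for each DNA sequence."""
    results = []
    for dna in sequences:
        rna = transcribe(dna)
        results.append(
            {"length": my_len(dna), "transcript": rna, "protein": translate(rna)}
        )
    return results


print(count_differences("GATTACA", "GACTATA"))  # 2
print(describe_sequences(["ATGGCCTAA"]))
# [{'length': 9, 'transcript': 'AUGGCCUAA', 'protein': 'MA'}]
```

Any trailing bases that do not form a complete codon are ignored by `translate`, and characters outside A, C, G, and T would raise a `KeyError`, which is a reasonable place to practice resolving errors.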
The final part of our Python crash course prepares students to read, create, and manipulate text, CSV, and FASTA files, and to import Python libraries such as Pandas!
Notebook download:
Student_PART_3_Python3_Workshop.ipynb
Students will develop a piece of code that (1) reads in a FASTA file, (2) creates a Pandas dataframe that stores multiple characteristics of the sequences found in the FASTA file, and (3) exports the Pandas dataframe as a CSV file.
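The three steps might be sketched as follows. The helper names (`parse_fasta`, `summarize`, `fasta_to_csv`), the chosen characteristics (length and GC fraction), and the file names in the usage note are all assumptions for illustration; the workshop notebook may structure this differently. The sketch requires the third-party `pandas` package, which Google Colab provides by default.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def summarize(header, sequence):
    """Return a few example characteristics of one sequence as a dict."""
    sequence = sequence.upper()
    gc_count = sequence.count("G") + sequence.count("C")
    return {
        "header": header,
        "length": len(sequence),
        "gc_fraction": gc_count / len(sequence) if sequence else 0.0,
    }


def fasta_to_csv(fasta_path, csv_path):
    """Read a FASTA file, build a dataframe, and export it as CSV."""
    # Imported here so the parsing helpers above also work without pandas.
    import pandas as pd

    with open(fasta_path) as handle:                     # (1) read the FASTA file
        records = [summarize(h, s) for h, s in parse_fasta(handle)]
    dataframe = pd.DataFrame(records)                    # (2) one row per sequence
    dataframe.to_csv(csv_path, index=False)              # (3) export as CSV
    return dataframe
```

With a file in the working directory, a call such as `fasta_to_csv("sequences.fasta", "sequences.csv")` (hypothetical file names) would write one CSV row per FASTA record. Building a list of dictionaries first keeps the parsing logic independent of Pandas and makes each row easy to inspect.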