Overview
Course objectives
Our fundamental goal is to ensure you have the necessary bioinformatics skills, in terms of both tools and the underlying analytic approaches, to participate fully in modern life science research.
Biology is becoming increasingly quantitative.
Instructor
Mike Hallett
GE120.07
Office TuTh 12-13
hallett.mike.t@gmail.com
@hallettmichael
Course
TuTh
September 8–December 7, 2020
13:15-14:30
Zoom
Slack
Github repo
Lectures
RStudio Cloud
Labs
Aki Kirbizakis
e.kirbizakis@gmail.com
Tu 10:15-11:30
Zoom
Wed 10-11
There are many reasons for this, including the development of high-throughput profiling approaches such as nucleotide sequencing, protein/lipid/small molecule mass spectrometry, and cellular/subcellular microscopy, and the list of new technologies continues to expand rapidly.
All areas of life science research, from basic biology to human health-related efforts, have been fundamentally changed by this influx of technology.
[Data Science] A single -omics experiment can now produce a staggering amount of data, which would be impossible to sift through using only human hands and eyes. The ability to develop software and analytic approaches to wrangle this data, clean it up, kick it into shape and visualize it in a way that is both informative and honest is an important skill set to have as a modern life science student. This course will give you the fundamentals of data science in the context of many examples, including metagenomics of ocean ecologies and transcriptomics of breast tumors.
[Bioinformatics] The ability to profile complete genomes, microbiomes, proteomes, interactomes and other “-omes” creates a need for computational infrastructure to capture and organize this data. This is often referred to as bioinformatics, and is focused on the development of tools, portals and databases to make this information available to all life scientists. This course will give many examples, primarily centered on the resources available at the NCBI.
[Computational Biology] Computational biology concerns itself with the development of new analytic techniques, typically expressed as computer programs, to explore data. For example, if you were generating many high-content microscopy images and were interested in identifying specific events (e.g. expression of a rare cellular or subcellular phenotype), you would benefit from developing a machine learning algorithm to sift through the gigabytes of images automatically. At the end of this course, you will understand the basics of machine learning, including (Hidden) Markov Models, linear and logistic regression, and deep convolutional neural networks.
Course materials
Free on-line textbook Grolemund Wickham
[Text Book] Throughout much of the course, we will follow
- Hadley Wickham and Garrett Grolemund, R for Data Science (NZ: O’Reilly Publishers Inc., 2017), https://r4ds.had.co.nz/. Available free online and for purchase from Amazon.
Note, however, that this is a general data science book and is not specific to biology and the life sciences. For several lectures, I will provide additional reading to cover the biology.
[Software] The course will heavily utilize several software packages
Google Gmail (free) For security reasons and for simplicity, we ask each student to obtain a gmail account for the course. You will need to send it to biol480.concordia@gmail.com at the beginning of the course.
Google Drive (free) We use the cloud-based text editor, drawing tool, spreadsheets, and storage offered by Drive to create and share files.
YouTube (free) All lectures are online at our YouTube channel.
Slack (free) When you send me your gmail account, you will be added to our Slack workspace.
Zoom (free) All lecture slots and TA sessions are on our Zoom channel available at the link above. When you have given us your gmail account, we will add you to Slack and you will see the password for the Zoom channel.
RStudio Cloud (~$28.00) This is a beautiful integrated development environment (IDE) for the language \({\tt R}\). It is based in the cloud, so you will not have to install any software. All of the lecture notes, code and datasets will be available to you at the website. We are currently hopeful that Concordia will cover this cost for each of you.
There are other ad hoc tools that we will use throughout the course, but they are all free.
Total cost for the course is ~$28.
[Hardware] All of the tools listed above are in the cloud. This means that you need minimal computing equipment. However, it will likely be very hard to program with only a tablet. If you do not have a laptop or machine at home, you should contact IITS (they loan equipment for free) or the instructor (I have some Google Chrome notebooks). Note that you can purchase a Google Chrome notebook for under $300 on Amazon.
Evaluation
Up to 5 pts can be added to your score (out of 100) from bonus \(+1\)s awarded to you by students, the instructor or the TA. It goes like this: if someone helps you very significantly with an assignment question or project, then you tell the instructor, in a paragraph, why that person deserves a \(+1\). Little puzzles in lectures and labs can also earn people \(+1\)s. Most \(+1\)s, however, are earned by answering each other’s questions in the Slack channel for the course.
You can find descriptions and instructions for each exercise on the assignments page.
[Grading Scheme Undergraduates]
Assignment | Points |
---|---|
Quizzes, Puzzles, Opinions (8 x 5 pts; top 6 chosen) | 30 |
Midterm 1 (take home 1 hour) | 7.5 |
Midterm 2 (take home 1 hour) | 7.5 |
Final Exam (take home 2 hours) | 15 |
Homework assignments (4 x 10) | 40 |
Total pts: 100
[Grading Scheme Genomics Diploma and Graduate Students]
Assignment | Points |
---|---|
Quizzes, Puzzles, Opinions (8 x 5 pts; top 5 chosen) | 25 |
Midterm 1 (take home 1 hour) | 7.5 |
Midterm 2 (take home 1 hour) | 7.5 |
Final Exam (take home 2 hours) | 10 |
Homework assignments (4 x 10) | 40 |
Project | 10 |
Total pts: 100
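As a rough sketch of how the two grading schemes above add up, the short Python snippet below sums the components for a made-up student. The function name `final_score`, its parameters and the example marks are all hypothetical and only for illustration. It assumes that "top 6 chosen" (or top 5) means the highest-scoring quizzes count, and that bonus \(+1\)s are simply added on top, capped at 5.

```python
def final_score(quizzes, midterm1, midterm2, final_exam, homework,
                project=0.0, top_n=6, bonus=0):
    """Add up course points (hypothetical helper, not an official calculator).

    quizzes  : list of quiz scores, each out of 5
    homework : list of homework scores, each out of 10
    top_n    : number of best quizzes that count (6 undergrad, 5 diploma/grad)
    bonus    : number of +1s received; at most 5 are counted
    """
    # Assumption: "top N chosen" means the N highest quiz scores are kept.
    best_quizzes = sorted(quizzes, reverse=True)[:top_n]
    # Assumption: the bonus is added on top of the total, capped at 5 pts.
    capped_bonus = min(bonus, 5)
    return (sum(best_quizzes) + midterm1 + midterm2 + final_exam
            + sum(homework) + project + capped_bonus)


# Made-up undergraduate: 8 quizzes, 2 midterms, final exam, 4 homeworks, 7 bonus +1s.
example_quizzes = [5, 4, 3, 5, 0, 4, 5, 2]
example_score = final_score(example_quizzes, 6.0, 7.0, 12.0, [9, 8, 10, 7], bonus=7)
print(example_score)

# Perfect marks in each scheme, without bonus, should reach the stated total.
undergrad_max = final_score([5] * 8, 7.5, 7.5, 15, [10] * 4)
graduate_max = final_score([5] * 8, 7.5, 7.5, 10, [10] * 4, project=10, top_n=5)
print(undergrad_max, graduate_max)
```

For diploma and graduate students, pass `top_n=5` and a `project` score, as in the last call.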
A day in the life …
Deep learning-based tool to identify Candida albicans morphologies V Bettauer from Hallett lab
Ok so how does this work in our covid-19 reality? Here we go.
The video for each lecture is available on our YouTube channel. You should watch the video before the actual lecture.
We have a Slack workspace and you can ask questions in the #biol480 channel. You can get access to Slack by sending your gmail account here: biol480.concordia. Slack is great for asking questions.
The lectures are via Zoom. The Zoom password is in the Slack channel. The lectures will be used to discuss the material you have already watched (the lecture video mentioned above). Also, sometimes (8 times in total) there will be a small quiz, or I will give a small puzzle, or ask for your opinion on a subject relevant to the material covered to date. These will last ~10 minutes. Otherwise, I will answer questions and provide more depth on specific aspects of the slides.
The labs are also via Zoom. The TA (Aki) will be happy to answer questions and assist you with the assignments.
Office hours are via Slack. I am actually accessible almost always via Slack and find this a more efficient way to communicate. Slack also has video conferencing that is very easy to use and is more suitable for one-on-one discussion.
Midterms and exams are take-home. You can make use of whatever resources you would like but you need to cite all such resources used. If you do not, it will be treated as academic dishonesty.
Submission of assignments, projects and other material will be discussed as we progress through the course.
Time management
As a rule of thumb per week (2 lectures/week),
- readings require \(3\) hours;
- lecture on average \(2\) hours;
- lab \(1\) hour;
- assignment or project work \(3\) hours;
- Slack questions or office hours (TA or instructor) \(1\) hour;
- studying for midterms or exams \(1\) hour per week.
In total over \(13\) weeks, this comes to \(13 \cdot 11 = 143\) hours, roughly in line with \(3\) credits at \(45\) hours each (\(135\) hours).
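If you want to adapt this rule of thumb to your own schedule, the arithmetic is easy to redo. The snippet below is only a sketch in Python: the weekly hours are copied from the list above, and the variable names are my own for illustration.

```python
# Weekly hours, taken from the rule-of-thumb list above.
weekly_hours = {
    "readings": 3,
    "lectures": 2,
    "lab": 1,
    "assignment or project work": 3,
    "Slack questions or office hours": 1,
    "studying for midterms or exams": 1,
}

weeks = 13
hours_per_week = sum(weekly_hours.values())
total_hours = weeks * hours_per_week

# Nominal workload for comparison: 3 credits at 45 hours each.
credit_hours = 3 * 45

print(f"{hours_per_week} hours/week, {total_hours} hours over {weeks} weeks")
print(f"nominal credit workload: {credit_hours} hours")
```

Change any entry in `weekly_hours` to see how your own total compares with the nominal credit workload.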
Course policies
Visualization of the Global Yeast Genetic Interaction Network Usaj et al.
Be nice and honest, as defined at least partially in the Academic Code of Conduct. This New York Times article seems appropriate in the covid-19 era: If everyone else is doing it …. You might check out the ethos statement my lab designed; it clarifies everyone’s roles and responsibilities.
Especially in these covid-19 times, things can be stressful. Concordia does have units that can help, ranging from financial assistance to mental health. A good place to start is Concordia’s covid-19 resource page.
In some exercises, we state explicitly that group work is encouraged. In all cases, you must cite whoever helped you and state precisely what they helped you with. If this is not done, it constitutes academic dishonesty. Generally, however, discussion and interaction are encouraged in the course.
It is challenging to interact via Zoom and Slack. Although it is not a rule, I prefer that your cameras are on so that we can speak face to face. If that is problematic (e.g. slow internet, bad hair day), turning the camera on when asking questions is a nice gesture.
Non-Concordia students and auditors
I am totally happy to allow you to audit or follow the course either in real time or via the residual online resources. If you want to participate in our lectures, labs or office hours, you will need our Zoom access code and access to our Slack workspace and Google Drive resources. Please drop me an email.
Action items
Once you have read this entire overview, please email us at the course gmail biol480.concordia with your gmail account, last name, first name and student ID. Make sure it is easy for us to associate the name of your gmail account with your real name (if the username of your gmail account is e.g. “kingofeverything”, it might be hard for us to guess who this is, unfortunately). For security reasons and for organizational simplicity for TA Aki, we ask that you send a gmail address.