In this capstone project course, we'll compare the genome sequences of COVID-19 mutations to identify regions that a drug therapy could target. The first step in drug discovery is identifying target subsequences of the virus's genome. We'll start by comparing the genomes of virus mutations to look for similarities. Then, we'll perform principal component analysis (PCA) to reduce the number of dimensions and identify the most common features. Next, we'll use K-means clustering in Python to find the optimal number of groups and trace the lineage of the virus. Finally, we'll predict the similarity between sequences and use it to pick a target subsequence. Throughout the course, each section consists of a programming assignment paired with a guide video and helpful hints. By the end, you'll be well on your way to discovering ways to combat disease with genome sequencing.
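As a rough illustration of the first step described above (comparing sequences to look for similarities), here is a minimal sketch. It assumes a k-mer counting approach with cosine similarity, which is one common way to compare sequences and not necessarily the method used in the course assignments. The function names `kmer_counts` and `cosine_similarity` and the toy sequences are hypothetical, made up for this example.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every overlapping k-mer (substring of length k) in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count profiles (0.0 to 1.0)."""
    # Counter returns 0 for k-mers missing from b, so only shared k-mers contribute.
    dot = sum(count * b[kmer] for kmer, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy sequences invented for illustration; they are NOT real viral genomes.
sequences = {
    "variant_a": "ATGGCGTACGTTAGCATGGCGTAC",
    "variant_b": "ATGGCGTACGTTAGCATGGCTTAC",
    "variant_c": "TTTTCCCCGGGGAAAATTTTCCCC",
}

profiles = {name: kmer_counts(seq) for name, seq in sequences.items()}
names = sorted(profiles)

# Print the similarity score for every unordered pair of sequences.
for i, first in enumerate(names):
    for second in names[i + 1:]:
        score = cosine_similarity(profiles[first], profiles[second])
        print(f"{first} vs {second}: {score:.3f}")
```

The resulting k-mer count profiles could also serve as the feature vectors that later steps, such as PCA and K-means clustering, operate on, though the course may build its features differently.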
Offered By

About this Course
Flexible deadlines
Reset deadlines in accordance with your schedule.
Shareable Certificate
Earn a Certificate upon completion
100% online
Start instantly and learn on your own schedule.
Course 4 of 4 in the
Intermediate Level
We recommend that you take the other two courses in the specialization (or be familiar with their content) before attempting this capstone project.
Approx. 12 hours to complete
English
What you will learn
Analyze genome sequences to find similarities and identify target subsequences using predictive models.
Skills you will gain
- Whole Genome Sequencing
- Machine Learning
- Drug Discovery
- Dimensionality Reduction
- K-Means Clustering
Syllabus - What you will learn from this course
3 hours to complete
Comparing Genome Sequences
4 videos (Total 7 min)
3 hours to complete
Principal Component Analysis on Genome Sequences
2 videos (Total 3 min), 1 reading, 1 quiz
3 hours to complete
Feature Analysis using K-Means Clustering
2 videos (Total 2 min)
3 hours to complete
Predicting Bit Score to Find Sequence Matches
2 videos (Total 3 min)
About the AI for Scientific Research Specialization
