An introduction to data integration and statistical methods used in contemporary Systems Biology, Bioinformatics and Systems Pharmacology research.
Module 2 - Data Processing and Identifying Differentially Expressed Genes:
This set of lectures first discusses data normalization methods. Several lectures are then devoted to the problem of identifying differentially expressed genes, with a focus on the inner workings of the Characteristic Direction, a new method developed by the Ma'ayan Lab.
Module 3 - Gene List Enrichment Analyses:
In this module the emphasis is on tools developed by the Ma'ayan Lab to analyze gene sets. Several tools will be discussed, including Enrichr, GEO2Enrichr, Expression2Kinases and DrugPairSeeker. In addition, one lecture will be devoted to enrichment vector clustering, a method we developed, and two lectures will describe the popular GSEA method.
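As a rough illustration of what a gene list enrichment test computes, the sketch below scores the overlap between an input gene list and one annotated gene set with an upper-tail hypergeometric test. It is not the implementation used by Enrichr or any other tool named above. The function name, the gene symbols and the background size are hypothetical, and the sketch assumes every gene in both lists belongs to the background.

```python
from math import comb


def enrichment_p_value(gene_list, gene_set, background_size):
    """Return (overlap, p) for an upper-tail hypergeometric test.

    Assumption: every gene in gene_list and gene_set is drawn from a
    background of background_size genes.
    """
    query = set(gene_list)
    annotated = set(gene_set)
    k = len(query & annotated)   # observed overlap
    n = len(query)               # size of the input gene list
    K = len(annotated)           # size of the annotated gene set
    N = background_size          # number of genes in the background

    # P(overlap >= k) when n genes are drawn at random from the background
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return k, tail / total


# Hypothetical toy input: 4 query genes, a 3-gene set, 10 background genes
overlap, p = enrichment_p_value(["A", "B", "C", "D"], ["A", "B", "X"], 10)
print(overlap, round(p, 4))
```

A small p-value suggests the overlap is larger than expected by chance. In practice many gene sets are tested at once, so the p-values would also need a multiple-testing correction.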
Module 4 - Deep Sequencing Data Analysis:
A set of lectures will cover the basic steps and popular pipelines for analyzing RNA-seq and ChIP-seq data, going from the raw data to gene lists to figures. These lectures also cover UNIX/Linux commands and some programming elements of R, a popular, freely available statistical software environment.
Module 5 - PCA, Hierarchical Clustering, Self-Organizing Maps, and Network-based Clustering:
This module is devoted to various methods of clustering: principal component analysis, self-organizing maps, network-based clustering and hierarchical clustering. The theory behind these methods is covered in detail, followed by practical demonstrations of the methods using R and MATLAB.
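To give a flavor of one of these methods, here is a minimal sketch of agglomerative hierarchical clustering with average linkage and Euclidean distance. It is written in Python for brevity rather than in R or MATLAB, it is not taken from the course materials, and the function name and the toy points are hypothetical.

```python
from itertools import combinations
from math import dist


def average_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Cluster distance is the mean Euclidean distance over all pairs of
    members (average linkage). Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        pair_total = sum(dist(points[i], points[j]) for i in a for j in b)
        return pair_total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: linkage(clusters[pair[0]],
                                            clusters[pair[1]]))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters)
                    if idx not in (a, b)] + [merged]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: two well-separated pairs of points
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
print(sorted(average_linkage(points, 2)))
```

This naive version recomputes every pairwise linkage at each step, which is fine for a handful of samples but too slow for full expression matrices, where a library implementation would be the usual choice.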
Module 6 - Resources for Data Integration:
Next are lectures about the various types of networks that are typically constructed and analyzed in systems biology and systems pharmacology. These lectures start with the idea of functional association networks (FANs). Several subsequent lectures discuss how to construct FANs from various resources, how to use these networks to analyze gene lists, and how to construct a puzzle that can be used to connect genomic data with phenotypic data.
Module 7 - Crowdsourcing:
The final set of lectures presents the idea of crowdsourcing. On Coursera we have the opportunity to work together on projects that are difficult to complete alone (microtasks) or to compete by devising and implementing algorithms for solving hard problems (megatasks). You will have the opportunity to participate in three crowdsourcing projects: one microtask and two megatasks. These are projects we designed specifically for this course.
Basic courses in statistics and molecular biology are useful but not required. Familiarity with environments such as R and MATLAB is useful but not necessary.
Review articles and selected original research articles will be discussed in the lectures and can enhance understanding, but these are not required to complete the course. All materials will be from open access journals or will be provided as links to e-reprints, so there will be no cost to the student.
The class will consist of lecture videos, each between 8 and 15 minutes in length. Each lecture will include a quiz and a homework assignment. Students will be graded mainly on their participation in the assignments and on quiz completion.
Yes. Students who successfully complete the course will receive a Statement of Accomplishment signed by the Course Director.
The course is designed to accommodate students from diverse backgrounds. Specifically, background in molecular biology, statistics, and computer programming is most helpful, but such background is not assumed or required.
The class can be easy if the student is only concerned with playing a relatively passive role. However, students are encouraged to engage with the course, take initiative and exercise their creativity. This may require more time and effort but will be more fun and rewarding.