This course distills the expert knowledge and skills mastered by professionals in Health Big Data Science and Bioinformatics. You will learn exciting facts about human biology, chemistry, genetics, and medicine, intertwined with the science of Big Data and the skills to harness the avalanche of openly available data that we are only starting to make sense of. We’ll investigate the steps required to master Big Data analytics on real datasets, including Next Generation Sequencing data, in a healthcare and biological context: preparing data for analysis, completing the analysis, interpreting and visualizing the results, and sharing them.
Once you master these high-demand skills, you will be well positioned to apply for or move into positions in biomedical data analytics and bioinformatics. Whatever your skill level in biomedical or technical areas, you will gain highly valuable new or sharpened skills that will make you stand out as a professional and encourage you to dive even deeper into biomedical Big Data. It is my hope that this course will spark your interest in the vast possibilities offered by publicly available Big Data to better understand, prevent, and treat diseases.
After this module, you will be able to
1. Locate and download files for data analysis involving genes and medicine.
2. Open files and preprocess data using the R language.
3. Write R scripts to replace missing values, normalize data, discretize data, and sample data.
Working with cBioPortal - Genetic Data Analysis•9 minutes
Working with cBioPortal - Gene Networks•9 minutes
2 readings•Total 20 minutes
Module 1 cBioPortal Data Analytics•10 minutes
Module 1 Resources•10 minutes
6 assignments•Total 180 minutes
DNA, RNA, Genes, and Proteins•30 minutes
Transcription and Translation Processes•30 minutes
Data, Variables, and Big Datasets•30 minutes
Working with cBioPortal•30 minutes
Module 1 Quiz•30 minutes
Module 1 cBioPortal Data Analytics•30 minutes
1 discussion prompt•Total 10 minutes
Module 1 Discussion•10 minutes
Preparing Datasets for Analysis
Module 2•8 hours to complete
Module details
After this module, you will be able to:
1. Locate and download files for data analysis involving genes and medicine.
2. Open files and preprocess data using the R language.
3. Write R scripts to replace missing values, normalize data, discretize data, and sample data.
After this module, you will be able to
1. Select features from high-dimensional datasets.
2. Evaluate the performance of feature selection methods.
3. Write R scripts to select features from datasets involving gene expressions.
Module 3 R Finding Differentially Expressed Genes•10 minutes
Module 3 Resources•10 minutes
6 assignments•Total 180 minutes
Feature Selection Methods•30 minutes
Evaluation Schemes•30 minutes
Differentially Expressed Genes•30 minutes
Heatmaps•30 minutes
Module 3 Quiz•30 minutes
Module 3 R Finding Differentially Expressed Genes•30 minutes
1 discussion prompt•Total 10 minutes
Module 3 Discussion•10 minutes
2 ungraded labs•Total 120 minutes
Module 3 Notebook•60 minutes
Module 3 Notebook•60 minutes
Predicting Diseases from Genes
Module 4•8 hours to complete
Module details
After this module, you will be able to
1. Build classification and prediction models.
2. Evaluate the performance of classification and prediction methods.
3. Write R scripts to classify and predict diseases from gene expressions.
Overview of Classification and Prediction Methods•9 minutes
Classification Methods Based on Analogy•12 minutes
Classification Methods Based on Rules•13 minutes
Classification Methods Based on Neural Networks•7 minutes
Classification Methods Based on Statistics•4 minutes
Classification Methods Based on Probabilities•8 minutes
Prediction Methods•4 minutes
Evaluation Schemes•14 minutes
Prediction Workflow•4 minutes
R Scripts for Prediction•2 minutes
Jupyter Notebooks 101•7 minutes
4 readings•Total 40 minutes
Jupyter Notebooks Essentials•10 minutes
Notebook Module 4 Tutorial•10 minutes
Module 4 R Predicting Diseases from Genes•10 minutes
Module 4 Resources•10 minutes
10 assignments•Total 300 minutes
Overview•30 minutes
Classification with Analogy•30 minutes
Classification based on Rules•30 minutes
Classification with Neural Networks•30 minutes
Classification based on Statistics•30 minutes
Classification based on Probabilities•30 minutes
Prediction Models•30 minutes
Evaluation Schemes•30 minutes
Module 4 Quiz•30 minutes
Module 4 R Predicting Diseases from Genes•30 minutes
1 discussion prompt•Total 10 minutes
Module 4 Discussion•10 minutes
1 ungraded lab•Total 60 minutes
Module 4 Notebook•60 minutes
Determining Gene Alterations
Module 5•7 hours to complete
Module details
After this module, you will be able to
1. List different types of gene alterations.
2. Compare and contrast methods for detecting gene mutations.
3. Compare and contrast methods for detecting methylation.
4. Compare and contrast methods for detecting copy number variations.
5. Quantify genomic alterations.
6. Connect genomic alterations to differential expression of genes.
7. Write programs in R for determining gene alterations and their relationship with gene expression.
Genomic Alterations and Gene Expressions•17 minutes
R Scripts for Gene Alterations•2 minutes
Jupyter Notebooks 101•7 minutes
4 readings•Total 40 minutes
Notebook Module 5 Tutorial•10 minutes
Jupyter Notebooks Essentials•10 minutes
Module 5 R Gene Alterations•10 minutes
Module 5 Resources•10 minutes
8 assignments•Total 230 minutes
Gene Alterations•30 minutes
Gene Mutations•30 minutes
Methylation•30 minutes
Copy Number Alterations•30 minutes
Genomic Alterations and Gene Expressions•30 minutes
Module 5 Quiz (Temporary)•30 minutes
Module 5 Quiz•20 minutes
Module 5 R Gene Alterations•30 minutes
1 discussion prompt•Total 10 minutes
Module 5 Discussion•10 minutes
1 ungraded lab•Total 60 minutes
Module 5 Notebook•60 minutes
Clustering and Pathway Analysis
Module 6•6 hours to complete
Module details
After this module, you will be able to
1. Find clusters in biomedical data involving genes.
2. Analyze and visualize biological pathways.
3. Write R scripts for clustering and for pathway analysis.
The State University of New York, with 64 unique institutions, is the largest comprehensive system of higher education in the United States. Educating nearly 468,000 students in more than 7,500 degree and certificate programs both on campus and online, SUNY has nearly 3 million alumni around the globe.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I purchase the Certificate?
When you purchase a Certificate, you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.