Date: 2016-11-11
Important note: The second assignment in this course covers Graph Analysis in the Cloud, in which you will use Elastic MapReduce and the Pig language to perform graph analysis over a moderately large dataset of about 600GB. To complete this assignment, you will need to use Amazon Web Services (AWS). Amazon has generously offered to provide up to $50 in free AWS credit to each learner in this course so that you can complete the assignment. Further details on how to receive this credit are available in the welcome message for the course, as well as in the assignment itself. Please note that Amazon, University of Washington, and Coursera cannot reimburse you for any charges if you exhaust your credit.
This course is part of the Data Science at Scale Specialization
About this Course
Offered by
University of Washington
Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world.
Syllabus - What you will learn from this course
Visualization
Statistical inferences from large, heterogeneous, and noisy datasets are useless if you can't communicate them to your colleagues, your customers, your management and other stakeholders. Learn the fundamental concepts behind information visualization, an increasingly critical field of research and increasingly important skillset for data scientists. This module is taught by Cecilia Aragon, faculty in the Human Centered Design and Engineering Department.
Privacy and Ethics
Big Data has become closely linked to issues of privacy and ethics: As the limits on what we *can* do with data continue to evaporate, the question of what we *should* do with data becomes paramount. Working through case studies, you will learn the core principles of codes of conduct for data science and statistical analysis. You will also learn the limits of current theory on protecting privacy while still permitting useful statistical analysis.
Reproducibility and Cloud Computing
Science is facing a credibility crisis due to unreliable reproducibility, and as research becomes increasingly computational, the problem paradoxically seems to be getting worse. But reproducibility is not just for academics: Data scientists who cannot share, explain, and defend their methods for others to build on are dangerous. In this module, you will explore the importance of reproducible research and how cloud computing offers new mechanisms for sharing code, data, environments, and even costs that are critical for practical reproducibility.
Reviews
- 5 stars: 35.55%
- 4 stars: 23.70%
- 3 stars: 17.03%
- 2 stars: 8.14%
- 1 star: 15.55%
TOP REVIEWS FROM COMMUNICATING DATA SCIENCE RESULTS
The information for the last assignment is split between the Forums and the Tasks description. This is very easy to fix, and not doing so shows passivity from the organizers.
Great and useful first week about visualization, although I wish it covered more material. The ethics and cloud computing modules felt somewhat incomplete, but were useful as well.
Too few people participated, and the peer review time was long. But the course content is good.
About the Data Science at Scale Specialization
Learn scalable data management, evaluate big data technologies, and design effective visualizations.