The "Data Collection and Integration" course teaches techniques for gathering data from diverse sources, including files, relational databases, web pages, and APIs. Participants gain practical experience in collecting and integrating data for further processing and analysis. The course emphasizes appropriate tools and packages, such as Pandas, Beautiful Soup, and SQL, for handling real-life datasets and addressing data integration challenges.



Data Collection and Integration
This course is part of the Data Wrangling with Python Specialization

Instructor: Di Wu
(14 reviews)
What you'll learn
- How to use Python and Python packages to collect data from various sources
- How to integrate data collected from various sources into a unified dataset for further processing and analysis
Details to know

6 assignments

There are 6 modules in this course
The "Collect Data from Files" week focuses on equipping you with the necessary skills to handle various file formats, such as txt, csv, json, xml, html, and more, for effective data collection. You will learn how to read, parse, and extract relevant data from different file types, enabling you to gather valuable information from diverse sources.
What's included
2 videos, 3 readings, 1 assignment, 1 discussion prompt
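As a rough illustration of the kind of file parsing this week describes, the sketch below reads the same sort of record from CSV, JSON, and XML text using only the Python standard library. The sample contents, field names (`id`, `name`), and helper function names are hypothetical and are not taken from the course materials.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample contents; in practice these would be read from files,
# e.g. with open(path, newline="", encoding="utf-8").
CSV_TEXT = "id,name\n1,Ada\n2,Grace\n"
JSON_TEXT = '[{"id": 3, "name": "Linus"}]'
XML_TEXT = "<people><person id='4'><name>Guido</name></person></people>"


def read_csv_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"id": int(row["id"]), "name": row["name"]} for row in reader]


def read_json_records(text):
    """Parse a JSON array of objects into a list of dicts."""
    return [{"id": int(item["id"]), "name": item["name"]} for item in json.loads(text)]


def read_xml_records(text):
    """Parse <person id="..."><name>...</name></person> elements into dicts."""
    root = ET.fromstring(text)
    return [
        {"id": int(person.get("id")), "name": person.findtext("name")}
        for person in root.iter("person")
    ]


# Every reader returns the same record shape, so the results can be combined.
records = (
    read_csv_records(CSV_TEXT)
    + read_json_records(JSON_TEXT)
    + read_xml_records(XML_TEXT)
)
print(records)
```

Giving every reader the same output shape is one possible design choice; it keeps the later integration step independent of the original file format.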
The "Collect Data from Web" week focuses on empowering you with the skills to extract data from various webpage formats using Python libraries like requests and Beautiful Soup. You will learn how to access web pages, retrieve HTML content, and parse the data to collect relevant information effectively.
What's included
1 video, 2 readings, 1 assignment, 1 discussion prompt
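The parsing step described for this week might look something like the sketch below. To avoid third-party dependencies and network access, it swaps in the standard-library `html.parser` in place of Beautiful Soup and parses an inline HTML string rather than a page fetched with requests. The page content and the `LinkCollector` class are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page content; a live page would be fetched first, e.g. with
# requests.get(url).text, and the resulting string parsed in the same way.
HTML_TEXT = """
<html><body>
  <h1>Datasets</h1>
  <ul>
    <li><a href="/data/a.csv">Dataset A</a></li>
    <li><a href="/data/b.csv">Dataset B</a></li>
  </ul>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from every <a> element with an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


parser = LinkCollector()
parser.feed(HTML_TEXT)
print(parser.links)
```

With Beautiful Soup, the same extraction would typically be a short loop over the `<a>` elements it finds; the event-driven parser above is used only to keep the example dependency-free.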
The "Collect Data from Database" week focuses on equipping you with the skills to interact with various SQL-like databases using Python packages. You will learn how to connect to databases, execute queries, and retrieve data from different database systems, enabling you to collect and utilize data efficiently.
What's included
1 video, 2 readings, 1 assignment, 1 discussion prompt
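A minimal sketch of the connect, query, and retrieve cycle described for this week, assuming an in-memory SQLite database as a stand-in for whichever database systems the course actually uses. The `orders` table, its columns, and the `total_by_customer` helper are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for a real database server.
# The table name and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5), ("alice", 12.5)],
)
conn.commit()


def total_by_customer(connection, min_total):
    """Return (customer, total) rows whose total is at least min_total."""
    cursor = connection.execute(
        "SELECT customer, SUM(amount) AS total "
        "FROM orders GROUP BY customer "
        "HAVING SUM(amount) >= ? ORDER BY customer",
        (min_total,),  # the value is bound as a parameter, not formatted into the SQL
    )
    return cursor.fetchall()


rows = total_by_customer(conn, 30.0)
print(rows)
conn.close()
```

Binding values with `?` placeholders rather than building the SQL string by hand is the usual way to avoid quoting bugs and SQL injection when queries take user-supplied input.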
The "Collect Data from APIs" week focuses on enabling you to interact with various websites that provide Application Programming Interfaces (APIs). You will learn how to access APIs, retrieve data in structured formats (e.g., JSON or XML), and utilize Python to process and extract valuable information from API responses.
What's included
1 video, 1 reading, 1 assignment, 1 discussion prompt
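The request and response handling described for this week could be sketched as follows. The endpoint URL, query parameters, and JSON response shape are all hypothetical, and a canned response body is used so the example runs without network access.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://api.example.com/v1/stations"


def build_url(base_url, **params):
    """Append URL-encoded query parameters to the endpoint URL."""
    return f"{base_url}?{urlencode(params)}"


def extract_readings(payload):
    """Flatten the nested JSON payload into one flat dict per reading."""
    return [
        {"station": station["name"], "time": r["time"], "temp_c": r["temp_c"]}
        for station in payload["stations"]
        for r in station["readings"]
    ]


# A canned response body; a live call would obtain this text over HTTP,
# e.g. with urllib.request.urlopen(url).read() or requests.get(url).text.
SAMPLE_BODY = """
{"stations": [
  {"name": "north", "readings": [{"time": "09:00", "temp_c": 11.5},
                                 {"time": "10:00", "temp_c": 12.0}]},
  {"name": "south", "readings": [{"time": "09:00", "temp_c": 14.2}]}
]}
"""

url = build_url(BASE_URL, city="Oslo", limit=2)
readings = extract_readings(json.loads(SAMPLE_BODY))
print(url)
print(readings)
```

Flattening nested responses into one record per row is a common preparation step, since flat records load directly into tabular tools.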
The "Data Integration" week focuses on the techniques and methodologies for integrating data collected from various sources. You will learn how to combine and merge datasets, handle data inconsistencies, and create a unified dataset for further analysis and decision-making.
What's included
1 video, 2 readings, 1 assignment, 1 discussion prompt
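One way the combining and reconciling steps described for this week might look with Pandas is sketched below. The two small source tables, their column names, and the specific inconsistencies (mismatched key names, string versus integer keys, stray whitespace) are hypothetical examples rather than course data.

```python
import pandas as pd

# Hypothetical records from two sources that describe the same customers.
from_file = pd.DataFrame(
    {"customer_id": [1, 2, 3], "Name": [" Ada ", "Grace", "Linus"]}
)
from_api = pd.DataFrame(
    {"id": ["1", "2", "4"], "total_spent": [20.0, 35.5, 7.0]}
)

# Harmonise column names, key types, and stray whitespace before merging.
left = from_file.rename(columns={"Name": "name"})
left["name"] = left["name"].str.strip()
right = from_api.rename(columns={"id": "customer_id"})
right["customer_id"] = right["customer_id"].astype(int)

# An outer join keeps customers that appear in only one source;
# indicator=True adds a "_merge" column recording where each row came from.
unified = left.merge(right, on="customer_id", how="outer", indicator=True)
print(unified)
```

The `_merge` column makes it easy to spot records that exist in only one source, which is often the first inconsistency worth investigating.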
The "Case Study" week offers you the opportunity to apply the knowledge you have learned throughout the course in a practical and comprehensive case study. You will engage in data collection from various sources, including files, SQL-like databases, and web APIs, and then integrate the collected data into a unified dataset for further analysis. This week serves as a culminating activity, allowing you to demonstrate your skills in data collection, integration, and preparation for analysis.
What's included
1 reading, 1 assignment




Learner reviews
14 reviews
- 5 stars: 86.66%
- 4 stars: 0%
- 3 stars: 6.66%
- 2 stars: 0%
- 1 star: 6.66%
Reviewed on Dec 6, 2023
Great course, and easy to follow along to learn the material. Great exercises and practice.


