This course covers advanced data structures in Python, focusing on the NumPy and Pandas libraries. It introduces the ndarray, NumPy's multidimensional array object, which enables efficient storage and manipulation of large datasets. Learners also explore the Series and DataFrame structures offered by Pandas, which provide labeled one- and two-dimensional containers for data analysis and manipulation. Throughout the course, students work through practical exercises and case studies that apply these data structures to real-world scenarios.



BiteSize Python: NumPy and Pandas
This course is part of the BiteSize Python for Intermediate Learners Specialization

Instructor: Di Wu
What you'll learn
- Understanding and utilizing the ndarray from the NumPy library. 
- Exploring the Series and DataFrame structures in the Pandas library. 
- Practical applications of advanced data structures in data analysis and manipulation. 
Details to know

5 assignments

There are 5 modules in this course
This module introduces the ndarray, the core data structure of the NumPy library that allows for efficient manipulation of large, multi-dimensional arrays. It begins with an overview of what an ndarray is and compares its capabilities to Python's built-in list data structure. The module then covers how to create ndarray objects, access and manipulate both 1D and 2D arrays, and perform various operations on these arrays. By the end of this module, learners will gain a solid understanding of how to effectively use ndarray for numerical and data analysis tasks.
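The topics in this module (list versus ndarray, creation, 1D and 2D access, and basic operations) can be sketched as follows. This is a minimal illustration rather than course material; the sample values and variable names are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
prices = [1.5, 2.0, 3.5]

# A Python list has no element-wise arithmetic, so a comprehension is needed.
doubled_list = [p * 2 for p in prices]

# An ndarray applies arithmetic to every element at once.
arr = np.array(prices)
doubled_arr = arr * 2

# Create a 2D array with 2 rows and 3 columns: [[0, 1, 2], [3, 4, 5]].
grid = np.arange(6).reshape(2, 3)

first_row = grid[0]          # 1D view of row 0
element = grid[1, 2]         # row 1, column 2
second_col = grid[:, 1]      # every row, column 1
col_sums = grid.sum(axis=0)  # sum down each column

print(doubled_arr, first_row, element, second_col, col_sums)
```

The comprehension and the ndarray expression produce the same numbers; the ndarray version avoids the explicit Python loop, which is the usual reason it scales better to large arrays.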
What's included
6 readings, 1 assignment, 6 ungraded labs
This module goes deeper into the NumPy library. It covers universal functions (ufuncs), which apply element-wise operations to ndarrays and enable efficient computation across large datasets. The module also explores the statistical methods available in NumPy, linear algebra operations for solving mathematical problems, random number generation for simulations and modeling, and masking techniques for filtering data. By the end of this module, learners will be equipped to use NumPy for advanced numerical analysis.
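A short sketch of the five areas named above (ufuncs, statistics, linear algebra, random numbers, and masking) might look like the following. The data, the seed, and the small linear system are assumptions made up for illustration.

```python
import numpy as np

# Hypothetical sample data.
values = np.array([1.0, 4.0, 9.0, 16.0])

# Universal function: np.sqrt is applied to each element.
roots = np.sqrt(values)

# Statistical methods on the array.
mean_value = values.mean()
std_value = values.std()

# Linear algebra: solve the system 2x + y = 5, x + 3y = 10.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])
solution = np.linalg.solve(A, b)

# Random number generation with a seeded generator for reproducibility.
rng = np.random.default_rng(seed=42)
samples = rng.integers(0, 10, size=5)  # five integers in [0, 10)

# Masking: a boolean array selects the elements that satisfy a condition.
mask = values > 5
large = values[mask]

print(roots, mean_value, solution, samples, large)
```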
What's included
1 reading, 1 assignment, 5 ungraded labs
This module introduces the Series data structure in Pandas, which is a one-dimensional labeled array capable of holding any data type. It begins by defining what a Series is and its significance in data analysis. The module covers various methods to create a Series, including using lists, dictionaries, and NumPy arrays. Learners will also explore how to access and manipulate elements within a Series, as well as perform mathematical operations on Series data. By the end of this module, students will understand how to utilize Series for effective data manipulation and analysis.
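As a rough illustration of the creation, access, and arithmetic steps described above, the sketch below builds a Series from a list, a dictionary, and a NumPy array. All labels and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Three ways to create a Series (sample values are made up).
from_list = pd.Series([10, 20, 30])                 # default integer index
from_dict = pd.Series({"a": 1, "b": 2, "c": 3})     # dict keys become labels
from_array = pd.Series(np.array([0.5, 1.5]), index=["x", "y"])

# Access by label, by position, and by boolean condition.
by_label = from_dict["b"]
by_position = from_dict.iloc[0]
subset = from_dict[from_dict > 1]

# Mathematical operations apply element-wise and keep the labels.
scaled = from_dict * 10
total = from_list.sum()

print(by_label, by_position, subset.index.tolist(), scaled.tolist(), total)
```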
What's included
2 readings, 1 assignment, 3 ungraded labs
This module introduces the DataFrame data structure in Pandas, which is a two-dimensional labeled data structure that can hold heterogeneous data types. The module begins by defining what a DataFrame is and its significance in data analysis and manipulation. Learners will explore various methods to create DataFrames from sources such as dictionaries, lists, and external files (e.g., CSV). The module covers how to access data within a DataFrame using labels and indices, manipulate rows and columns, and perform operations such as merging and concatenating multiple DataFrames. By the end of this module, students will be proficient in utilizing DataFrames for data manipulation tasks.
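The DataFrame operations listed above can be sketched as below. The names, scores, and cities are invented, and the CSV is read from an in-memory string (an assumption made so the example runs without an external file).

```python
import io
import pandas as pd

# Create DataFrames from a dictionary and from a list of rows.
students = pd.DataFrame({"name": ["Ana", "Ben"], "score": [90, 85]})
from_rows = pd.DataFrame([["Cy", 70]], columns=["name", "score"])

# Read CSV data; io.StringIO stands in for a file on disk.
csv_text = "name,city\nAna,Lima\nBen,Oslo\n"
cities = pd.read_csv(io.StringIO(csv_text))

# Access by label (loc) and by integer position (iloc).
ana_score = students.loc[0, "score"]
first_name = students.iloc[0, 0]

# Add a column derived from an existing one.
students["passed"] = students["score"] >= 88

# Concatenate rows from two frames, then merge on a shared key column.
all_students = pd.concat(
    [students[["name", "score"]], from_rows], ignore_index=True
)
merged = pd.merge(students, cities, on="name")

print(all_students)
print(merged)
```

`pd.concat` stacks frames that share columns, while `pd.merge` joins frames on matching key values; which one applies depends on whether the goal is more rows or more columns per row.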
What's included
2 readings, 1 assignment, 7 ungraded labs
This module provides an in-depth exploration of the Pandas library, which is essential for data manipulation and analysis in Python. It starts with an overview of what Pandas is and its significance in data science. The module highlights useful functionalities within Pandas, including data loading, cleaning, and preparation. Learners will examine how to generate descriptive statistics for both numerical and categorical columns, use the groupby() method for data aggregation, and handle missing and duplicate values effectively. By the end of this module, students will have a solid understanding of how to leverage Pandas for comprehensive data analysis.
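A minimal sketch of the cleaning and summarizing steps mentioned above, using a small invented sales table. Filling missing amounts with 0.0 is an assumption made for the example; the appropriate strategy depends on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value and one duplicated row.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south", "south"],
    "amount": [100.0, 200.0, np.nan, 200.0, 50.0],
})

# Descriptive statistics for a numerical and a categorical column.
numeric_summary = sales["amount"].describe()
category_summary = sales["region"].describe()

# Count missing values, then drop exact duplicate rows.
missing_count = sales["amount"].isna().sum()
deduped = sales.drop_duplicates()

# Fill the remaining missing amount (assumed strategy: replace with 0.0).
filled = deduped.fillna({"amount": 0.0})

# Aggregate with groupby(): total amount per region.
totals = filled.groupby("region")["amount"].sum()

print(numeric_summary)
print(category_summary)
print(totals)
```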
What's included
2 readings, 1 assignment, 6 ungraded labs


