University of Colorado Boulder
BiteSize Python: NumPy and Pandas

Instructor: Di Wu

Gain insight into a topic and learn the fundamentals.
Intermediate level

1 week to complete
at 10 hours a week
Flexible schedule
Learn at your own pace

What you'll learn

  • Understanding and utilizing the ndarray from the NumPy library.

  • Exploring the Series and DataFrame structures in the Pandas library.

  • Practical applications of advanced data structures in data analysis and manipulation.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

5 assignments

Taught in English

Build your subject-matter expertise

This course is part of the BiteSize Python for Intermediate Learners Specialization
When you enroll in this course, you'll also be enrolled in this Specialization.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 5 modules in this course

This module introduces the ndarray, the core data structure of the NumPy library that allows for efficient manipulation of large, multi-dimensional arrays. It begins with an overview of what an ndarray is and compares its capabilities to Python's built-in list data structure. The module then covers how to create ndarray objects, access and manipulate both 1D and 2D arrays, and perform various operations on these arrays. By the end of this module, learners will gain a solid understanding of how to effectively use ndarray for numerical and data analysis tasks.
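The topics in this description (list versus ndarray, creation, 1D and 2D access, and basic operations) can be sketched as follows. This is a minimal illustration, not taken from the course materials; all variable names and values are made up for the example.

```python
import numpy as np

# A Python list repeats its contents when multiplied by an integer,
# while an ndarray multiplies each element.
prices_list = [1.5, 2.0, 3.25]
prices = np.array(prices_list)

doubled_list = prices_list * 2   # list repetition: six elements
doubled = prices * 2             # element-wise multiplication

# Creating arrays
grid = np.arange(6).reshape(2, 3)   # 2 rows, 3 columns, values 0..5
zeros = np.zeros((2, 2))            # 2x2 array filled with 0.0

# Accessing 1D and 2D arrays
first = prices[0]          # single element by position
last_two = prices[-2:]     # slice of the last two elements
cell = grid[1, 2]          # row 1, column 2
second_col = grid[:, 1]    # every row, column 1

# Operations on arrays
grid_plus_ten = grid + 10      # the scalar is applied to every element
col_sums = grid.sum(axis=0)    # sum down each column
```

The contrast between `doubled_list` and `doubled` is the main practical difference a newcomer tends to notice: arithmetic on an ndarray applies to the values, not to the container.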

What's included

6 readings, 1 assignment, 6 ungraded labs

This module delves deeper into the NumPy library, focusing on its powerful features and functionalities. It covers universal functions (ufuncs), which apply element-wise operations to ndarray objects and enable efficient computation across large datasets. The module also explores various statistical methods available in NumPy, linear algebra operations for solving mathematical problems, random number generation for simulations and modeling, and masking techniques for filtering data. By the end of this module, learners will be equipped with the skills to leverage NumPy's capabilities for advanced numerical analysis.
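As a rough sketch of the features named above (ufuncs, statistical methods, linear algebra, random number generation, and masking), the following example touches each one briefly. The data and variable names are illustrative assumptions, not course content.

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])

# Universal function (ufunc): applied to every element without a Python loop
roots = np.sqrt(values)

# Statistical methods
mean_value = values.mean()
max_value = values.max()

# Linear algebra: solve the system 2x + y = 5 and x + 3y = 10
coeffs = np.array([[2.0, 1.0], [1.0, 3.0]])
rhs = np.array([5.0, 10.0])
solution = np.linalg.solve(coeffs, rhs)

# Random number generation: ten simulated die rolls (high is exclusive)
rng = np.random.default_rng(seed=42)
rolls = rng.integers(low=1, high=7, size=10)

# Masking: a boolean array selects the elements that satisfy a condition
mask = values > 5
large_values = values[mask]
```

Passing a seed to `np.random.default_rng` makes the simulated rolls reproducible from run to run, which is convenient when checking results in a lab setting.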

What's included

1 reading, 1 assignment, 5 ungraded labs

This module introduces the Series data structure in Pandas, which is a one-dimensional labeled array capable of holding any data type. It begins by defining what a Series is and its significance in data analysis. The module covers various methods to create a Series, including using lists, dictionaries, and NumPy arrays. Learners will also explore how to access and manipulate elements within a Series, as well as perform mathematical operations on Series data. By the end of this module, students will understand how to utilize Series for effective data manipulation and analysis.
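The Series operations described here might look like the sketch below. It assumes small made-up datasets; the labels and values are hypothetical and chosen only to keep the example short.

```python
import numpy as np
import pandas as pd

# Creating a Series from a list, a dictionary, and a NumPy array
from_list = pd.Series([10, 20, 30])                  # default integer index
from_dict = pd.Series({"a": 1.5, "b": 2.5, "c": 4.0})  # keys become labels
from_array = pd.Series(np.array([3, 6, 9]), index=["x", "y", "z"])

# Accessing elements by label and by position
by_label = from_dict["b"]
by_position = from_dict.iloc[0]
subset = from_array[["x", "z"]]

# Manipulating elements (on a copy, so the original is left unchanged)
updated = from_list.copy()
updated.iloc[0] = 99

# Mathematical operations
scaled = from_dict * 2
total = from_list.sum()

# Arithmetic between two Series aligns on labels;
# a label missing from either side produces NaN.
combined = from_dict + pd.Series({"a": 1.0, "b": 1.0})
```

Label alignment in the last step is what distinguishes a Series from a plain NumPy array: values are matched by index label, not by position.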

What's included

2 readings, 1 assignment, 3 ungraded labs

This module introduces the DataFrame data structure in Pandas, which is a two-dimensional labeled data structure that can hold heterogeneous data types. The module begins by defining what a DataFrame is and its significance in data analysis and manipulation. Learners will explore various methods to create DataFrames from sources such as dictionaries, lists, and external files (e.g., CSV). The module covers how to access data within a DataFrame using labels and indices, manipulate rows and columns, and perform operations such as merging and concatenating multiple DataFrames. By the end of this module, students will be proficient in utilizing DataFrames for data manipulation tasks.
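The DataFrame workflow outlined above can be illustrated with a short sketch. The student records are invented for the example, and the CSV is read from an in-memory string (via `io.StringIO`) rather than a real file so that the code runs on its own.

```python
import io
import pandas as pd

# Creating DataFrames from a dictionary and from a list of records
students = pd.DataFrame({"name": ["Ana", "Ben", "Cy"], "score": [88, 92, 79]})
extra = pd.DataFrame([{"name": "Dee", "score": 95}])

# Creating a DataFrame from CSV text (stand-in for an external file)
csv_text = "name,major\nAna,Math\nBen,Physics\nDee,Biology\n"
majors = pd.read_csv(io.StringIO(csv_text))

# Accessing data by label (loc) and by integer position (iloc)
ben_score = students.loc[1, "score"]
first_name = students.iloc[0, 0]

# Manipulating columns: add a derived column, then drop it again
students["passed"] = students["score"] >= 80
without_flag = students.drop(columns=["passed"])

# Concatenating rows, then merging on a shared key column
all_students = pd.concat([without_flag, extra], ignore_index=True)
with_majors = all_students.merge(majors, on="name", how="left")
```

With `how="left"`, every row of `all_students` is kept; a student with no matching row in `majors` gets a missing value in the `major` column.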

What's included

2 readings, 1 assignment, 7 ungraded labs

This module provides an in-depth exploration of the Pandas library, which is essential for data manipulation and analysis in Python. It starts with an overview of what Pandas is and its significance in data science. The module highlights useful functionalities within Pandas, including data loading, cleaning, and preparation. Learners will examine how to generate descriptive statistics for both numerical and categorical columns, use the groupby() method for data aggregation, and handle missing and duplicate values effectively. By the end of this module, students will have a solid understanding of how to leverage Pandas for comprehensive data analysis.
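A minimal sketch of the cleaning and summarizing steps mentioned here is shown below. The sales table is hypothetical, and filling missing values with 0 is only one possible choice made for the sake of the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value and one duplicated row
sales = pd.DataFrame({
    "region": ["East", "West", "East", "West", "West"],
    "units": [10.0, np.nan, 30.0, 25.0, 25.0],
})

# Descriptive statistics: numeric columns report count, mean, std, and so on;
# categorical columns report count, unique, top, and freq.
numeric_summary = sales["units"].describe()
category_summary = sales["region"].describe()

# Handling duplicates: the second ("West", 25.0) row is dropped
deduped = sales.drop_duplicates()

# Handling missing values: count them, then fill them (here with 0.0)
missing_count = deduped["units"].isna().sum()
filled = deduped.fillna({"units": 0.0})

# Aggregation with groupby(): total units per region
units_by_region = filled.groupby("region")["units"].sum()
```

Whether to fill, drop, or keep missing values depends on the analysis; `dropna()` would be the alternative if rows with gaps should be excluded instead.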

What's included

2 readings, 1 assignment, 6 ungraded labs

Instructor

Di Wu
University of Colorado Boulder
21 courses, 54,225 learners
