The course "Multicore and GPGPU Programming" provides a foundational understanding of parallel programming, focusing on developing high-performance, multi-threaded applications in both CPU and GPU environments. Beginning with a review of multicore processor architectures, caching mechanisms, and Non-Uniform Memory Access (NUMA) systems, students will learn the essentials of shared memory programming, synchronisation techniques, and the use of locks to ensure data integrity across threads.

Multicore and GPGPU Programming

Recommended experience
Intermediate level
Basic knowledge of C/C++ and computer architecture is recommended.
What you will learn
Understand the fundamentals of multi-threaded programming and its applications in multicore systems.
Develop shared memory programs in OpenMP and distributed programming using MPI.
Gain a foundational understanding of GPGPU architecture and the CUDA programming model.
Skills you will gain
- Computer Hardware
- Algorithms
- Program Development
- Distributed Computing
- Performance Testing
Tools you will learn
- C (Programming Language)
Key details

124 assignments

There are 12 modules in this course
In this module, the learners will be introduced to the course and its syllabus, setting the foundation for their learning journey. The course's introductory video will provide them with insights into the valuable skills and knowledge they can expect to gain throughout the duration of this course. Additionally, the syllabus reading will comprehensively outline essential course components, including course values, assessment criteria, grading system, schedule, details of live sessions, and a recommended reading list that will enhance the learner’s understanding of the course concepts. Moreover, this module offers the learners the opportunity to connect with fellow learners as they participate in a discussion prompt designed to facilitate introductions and exchanges within the course community.
What's included
4 videos • 1 reading • 1 discussion prompt
4 videos • 51 minutes total
- Course Introductory Video • 2 minutes
- Meet Your Instructor - Dr. Gargi Prabhu • 1 minute
- Meet Your Instructor - Dr. Kunal Korgaonkar • 1 minute
- Recording of Multicore and GPGPU Programming: Week 1 - Live Session on 25-05-23 18:32:50 [47:25] • 47 minutes
1 reading • 10 minutes total
- Course Overview • 10 minutes
1 discussion prompt • 10 minutes total
- Meet Your Peers • 10 minutes
In this module, students will gain foundational knowledge of parallel and multi-threaded programming, exploring the core principles that underlie the efficient utilisation of modern multi-core and many-core processors. Beginning with an overview of parallel programming concepts, this module covers different types of parallelism, including data parallelism, task parallelism, and pipeline parallelism. Students will also examine critical performance metrics like speedup, efficiency, and scalability, which help in evaluating the benefits and trade-offs of parallel approaches.
What's included
12 videos • 2 readings • 12 assignments • 1 discussion prompt
12 videos • 73 minutes total
- Need for Ever-Increasing Performance • 8 minutes
- Parallel Systems and Parallel Programs • 8 minutes
- Concurrent, Parallel, Distributed Systems • 5 minutes
- Types of Parallelism: Data, Task and Pipeline Parallelism • 8 minutes
- Speedup and Efficiency • 5 minutes
- Amdahl’s Law • 5 minutes
- Gustafson’s Law • 5 minutes
- Scalability in Parallel Systems • 5 minutes
- Cost of Parallelisation • 7 minutes
- Sources of Overhead in Parallel Programs • 5 minutes
- Timing Parallel Programs: Methods and Best Practices • 7 minutes
- GPU Performance • 5 minutes
2 readings • 120 minutes total
- Recommended Reading: Fundamentals of Parallel Computing • 60 minutes
- Recommended Reading: Introduction to Performance Metrics in Parallel Computing • 60 minutes
12 assignments • 36 minutes total
- Need for Ever-Increasing Performance • 3 minutes
- Parallel Systems and Parallel Programs • 3 minutes
- Concurrent, Parallel, Distributed Systems • 3 minutes
- Types of Parallelism: Data, Task and Pipeline Parallelism • 3 minutes
- Speedup and Efficiency • 3 minutes
- Amdahl’s Law • 3 minutes
- Gustafson’s Law • 3 minutes
- Scalability in MIMD Systems • 3 minutes
- Cost of Parallelisation • 3 minutes
- Sources of Overhead in Parallel Programs • 3 minutes
- Taking Timings of Parallel Programs • 3 minutes
- GPU Performance • 3 minutes
1 discussion prompt • 30 minutes total
- Why Parallelism? Revisiting the Roots of Multicore Programming • 30 minutes
This module provides an in-depth exploration of multicore processor architectures, examining the design principles, performance considerations, and challenges involved in building efficient multicore systems. Students will study how multiple cores interact within a processor, focusing on memory hierarchies, caching mechanisms, and the role of parallelism in improving computational performance.
What's included
15 videos • 2 readings • 15 assignments • 1 discussion prompt
15 videos • 160 minutes total
- The Von Neumann Architecture • 7 minutes
- Processes, Multitasking, and Threads • 5 minutes
- The Basics of Caching • 7 minutes
- Virtual Memory • 7 minutes
- Instruction-Level Parallelism • 9 minutes
- Hardware Multithreading • 6 minutes
- Classifications of Parallel Computers • 6 minutes
- SIMD and MIMD Systems • 7 minutes
- Interconnection Networks: Shared Memory Systems • 6 minutes
- Interconnection Networks: Distributed Memory Systems • 8 minutes
- Cache Coherence • 8 minutes
- Shared-Memory vs. Distributed-Memory • 4 minutes
- Parallel Software: Coordinating Processes and Threads • 11 minutes
- Distributed Memory Software • 7 minutes
- Recording of Multicore and GPGPU Programming: Week 2 - Live Session on 25-05-30 18:35:08 [02:05] • 62 minutes
2 readings • 100 minutes total
- Recommended Reading: Architecture Background • 40 minutes
- Recommended Reading: Parallel Hardware and Software • 60 minutes
15 assignments • 114 minutes total
- The Von Neumann Architecture • 3 minutes
- Processes, Multitasking, and Threads • 3 minutes
- The Basics of Caching • 3 minutes
- Virtual Memory • 3 minutes
- Instruction-Level Parallelism • 3 minutes
- Hardware Multithreading • 3 minutes
- Classifications of Parallel Computers • 3 minutes
- SIMD and MIMD Systems • 3 minutes
- Interconnection Networks: Shared Memory Systems • 3 minutes
- Interconnection Networks: Distributed Memory Systems • 6 minutes
- Cache Coherence • 3 minutes
- Shared-Memory vs. Distributed-Memory • 3 minutes
- Parallel Software: Coordinating Processes and Threads • 12 minutes
- Distributed Memory Software • 3 minutes
- Graded Quiz - Modules 1 and 2 • 60 minutes
1 discussion prompt • 30 minutes total
- From Von Neumann to Multicore: Evolving Architectures and Memory Realities • 30 minutes
This module introduces students to the architectural principles of General-Purpose GPU (GPGPU) systems and the CUDA programming model. It explores the hardware components, including Streaming Multiprocessors (SMs), CUDA cores, and memory hierarchy, which form the foundation of GPU computing. The module also provides an overview of the CUDA programming model, emphasising its thread hierarchy, grid, and block organisation. By understanding these fundamental concepts, students will develop the ability to harness GPU architecture for high-performance parallel computing.
What's included
15 videos • 2 readings • 14 assignments • 1 discussion prompt
15 videos • 127 minutes total
- GPUs and GPGPU • 5 minutes
- GPU Architecture • 5 minutes
- Heterogeneous Computing • 4 minutes
- Paradigm of Heterogeneous Computing • 5 minutes
- Introduction to CUDA • 5 minutes
- Structure of a CUDA Program • 8 minutes
- Threads, Blocks, and Grid • 9 minutes
- Managing Memory • 7 minutes
- Writing and Verifying Your Kernel • 6 minutes
- Compiling and Running CUDA Program • 4 minutes
- Nvidia Compute Capabilities and Device Architecture • 6 minutes
- Timing Your Kernel • 7 minutes
- Organising Parallel Threads • 5 minutes
- Managing Devices • 4 minutes
- Recording of Multicore and GPGPU Programming: Week 3 - Live Session on 25-06-06 18:31:21 [44:50] • 45 minutes
2 readings • 75 minutes total
- Recommended Reading: GPGPU Architecture and CUDA • 15 minutes
- Recommended Reading: Programming Model Overview • 60 minutes
14 assignments • 48 minutes total
- GPUs and GPGPU • 6 minutes
- GPU Architecture • 3 minutes
- Heterogeneous Computing • 3 minutes
- Paradigm of Heterogeneous Computing • 3 minutes
- Introduction to CUDA • 3 minutes
- Structure of a CUDA Program • 3 minutes
- Threads, Blocks, and Grid • 6 minutes
- Managing Memory • 3 minutes
- Writing and Verifying Your Kernel • 3 minutes
- Compiling and Running CUDA Program • 3 minutes
- Nvidia Compute Capabilities and Device Architecture • 3 minutes
- Timing Your Kernel • 3 minutes
- Organising Parallel Threads • 3 minutes
- Managing Devices • 3 minutes
1 discussion prompt • 30 minutes total
- Harnessing GPU Power: Exploring CUDA and the Architecture of Parallelism • 30 minutes
This module provides a comprehensive understanding of how CUDA executes programs on GPUs. It covers key concepts such as warps, warp scheduling, and resource partitioning, which are critical for understanding GPU hardware behaviour. The module delves into branch divergence and its impact on performance, offering strategies to minimise its effects. It also emphasises exposing parallelism effectively by leveraging CUDA’s hierarchical execution model. Students will learn how to design and optimise GPU programs by aligning with the underlying execution model to maximise efficiency and throughput.
What's included
15 videos • 2 readings • 15 assignments • 1 discussion prompt
15 videos • 135 minutes total
- Introduction to CUDA Execution Model • 7 minutes
- Warps and Thread Blocks • 4 minutes
- Warp Divergence • 9 minutes
- Resource Partitioning • 6 minutes
- Latency Hiding • 10 minutes
- Occupancy • 5 minutes
- Synchronization • 4 minutes
- Scalability • 5 minutes
- Exposing Parallelism • 10 minutes
- Checking Active Warps with Nvprof • 6 minutes
- Checking Memory Operations with Nvprof • 7 minutes
- Avoiding Branch Divergence • 3 minutes
- The Parallel Reduction Problem and Thread Divergence • 7 minutes
- Improving Divergence in Parallel Reduction • 6 minutes
- Recording of Multicore and GPGPU Programming: Week 4 - Live Session on 25-06-13 18:32:39 [49:37] • 45 minutes
2 readings • 120 minutes total
- Recommended Reading: Structure of a CUDA Program • 60 minutes
- Recommended Reading: Exposing Parallelism and Avoiding Branch Divergence • 60 minutes
15 assignments • 105 minutes total
- Introduction to CUDA Execution Model • 3 minutes
- Warps and Thread Blocks • 3 minutes
- Warp Divergence • 3 minutes
- Resource Partitioning • 6 minutes
- Latency Hiding • 3 minutes
- Occupancy • 3 minutes
- Synchronization • 3 minutes
- Scalability • 3 minutes
- Exposing Parallelism • 3 minutes
- Checking Active Warps with Nvprof • 3 minutes
- Checking Memory Operations with Nvprof • 3 minutes
- Avoiding Branch Divergence • 3 minutes
- The Parallel Reduction Problem and Thread Divergence • 3 minutes
- Improving Divergence in Parallel Reduction • 3 minutes
- Graded Quiz - Modules 3 and 4 • 60 minutes
1 discussion prompt • 30 minutes total
- Under the Hood: Warps, Divergence, and CUDA Execution Dynamics • 30 minutes
The CUDA Memory Model & Streams and Concurrency module introduces students to the intricacies of memory hierarchy in CUDA, including global, shared, and local memory. It emphasises the importance of memory coalescing and efficient memory access patterns to optimise performance on GPUs. The module also covers CUDA streams, explaining how concurrent kernel execution and memory operations can be managed to enhance parallelism. By understanding these concepts, students will gain the ability to design GPU programs that maximise throughput and minimise latency.
What's included
14 videos • 2 readings • 14 assignments • 1 discussion prompt • 1 ungraded lab
14 videos • 126 minutes total
- Introduction to CUDA Memory Model • 8 minutes
- Memory Allocation and Deallocation • 6 minutes
- Zero Copy Memory • 4 minutes
- Unified Virtual Addressing and Unified Memory • 3 minutes
- Aligned and Coalesced Access • 6 minutes
- CUDA Shared Memory • 6 minutes
- Shared Memory Banks and Access Mode • 7 minutes
- Configuring the Amount of Shared Memory • 5 minutes
- Synchronisation • 9 minutes
- CUDA Streams • 7 minutes
- Stream Scheduling and Priorities • 6 minutes
- CUDA Events • 6 minutes
- Concurrent Kernel Execution • 6 minutes
- Recording of Multicore and GPGPU Programming: Week 5 - Live Session on 25-06-20 18:31:59 [47:36] • 48 minutes
2 readings • 120 minutes total
- Recommended Reading: CUDA Memory Model • 60 minutes
- Recommended Reading: Streams and Concurrency • 60 minutes
14 assignments • 342 minutes total
- Introduction to CUDA Memory Model • 3 minutes
- Memory Allocation and Deallocation • 3 minutes
- Zero Copy Memory • 3 minutes
- Unified Virtual Addressing and Unified Memory • 3 minutes
- Aligned and Coalesced Access • 3 minutes
- CUDA Shared Memory • 6 minutes
- Shared Memory Banks and Access Mode • 3 minutes
- Configuring the Amount of Shared Memory • 3 minutes
- Synchronisation • 3 minutes
- CUDA Streams • 3 minutes
- Stream Scheduling and Priorities • 3 minutes
- CUDA Events • 3 minutes
- Concurrent Kernel Execution • 3 minutes
- SGA-1: CUDA Programming and Performance Optimisation • 300 minutes
1 discussion prompt • 30 minutes total
- Smart Memory and Seamless Concurrency: CUDA Memory and Streams • 30 minutes
1 ungraded lab • 60 minutes total
- Hands-on Lab: Parallel Matrix Addition Using CUDA • 60 minutes
This module explains in depth the difference between processes and threads and introduces multithreaded programming with the Pthreads library. Students will learn the principal functions of the Pthreads API and apply them to solve real-world problems using a multithreaded approach. The module also discusses the precautions to take when designing multithreaded algorithms.
What's included
10 videos • 11 readings • 10 assignments • 1 discussion prompt
10 videos • 116 minutes total
- Processes, Threads and Pthreads • 4 minutes
- Hello World!! • 9 minutes
- Matrix-Vector Multiplication • 13 minutes
- Critical Sections • 5 minutes
- Busy Waiting • 6 minutes
- Mutexes • 5 minutes
- Semaphores • 7 minutes
- Barriers and Condition Variables • 13 minutes
- Caches, Cache-Coherence and False Sharing • 9 minutes
- Recording of Multicore and GPGPU Programming: Week 6 - Live Session on 25-06-27 18:38:36 [43:53] • 44 minutes
11 readings • 295 minutes total
- Recommended Reading: Processes, Threads and Pthreads • 10 minutes
- Recommended Reading: Hello World!! • 60 minutes
- Recommended Reading: Matrix-Vector Multiplication • 15 minutes
- Recommended Reading: Critical Sections • 30 minutes
- Recommended Reading: Busy Waiting • 20 minutes
- Recommended Reading: Mutexes • 15 minutes
- Recommended Reading: Semaphores • 30 minutes
- Recommended Reading: Barriers and Condition Variables • 30 minutes
- Recommended Reading: Read-Write Locks • 60 minutes
- Recommended Reading: Caches, Cache-Coherence and False Sharing • 15 minutes
- Lab Instruction Document • 10 minutes
10 assignments • 135 minutes total
- Processes, Threads and Pthreads • 9 minutes
- Hello World!! • 9 minutes
- Matrix-Vector Multiplication • 9 minutes
- Critical Sections • 9 minutes
- Busy Waiting • 9 minutes
- Mutexes • 9 minutes
- Semaphores • 6 minutes
- Barriers and Condition Variables • 6 minutes
- Caches, Cache-Coherence and False Sharing • 9 minutes
- Graded Quiz - Modules 5 and 6 • 60 minutes
1 discussion prompt • 10 minutes total
- Thread Synchronization and Shared Memory: Building Reliable Parallel Programs with Pthreads • 10 minutes
This module introduces distributed-memory programming using the Message Passing Interface (MPI). Students will learn the functions provided by the MPI library and how to use them, enabling them to develop parallel programs and to convert serial code into parallel code with the help of MPI.
What's included
7 videos • 9 readings • 7 assignments • 1 discussion prompt
7 videos • 70 minutes total
- Introduction to MPI • 4 minutes
- MPI Setup and Communicator Functions • 6 minutes
- SPMD and Communication • 10 minutes
- Potential Pitfalls • 4 minutes
- Simple Serial Sorting Algorithm • 20 minutes
- Parallel Odd-Even Transposition Sort • 19 minutes
- Safety in MPI Programs • 7 minutes
9 readings • 125 minutes total
- Recommended Reading: Introduction to MPI • 15 minutes
- Recommended Reading: MPI Setup and Communicator Functions • 15 minutes
- Recommended Reading: SPMD and Communication • 15 minutes
- Recommended Reading: Potential Pitfalls • 15 minutes
- Recommended Reading: Simple Serial Sorting Algorithm • 15 minutes
- Recommended Reading: Parallel Odd-Even Transposition Sort • 15 minutes
- Recommended Reading: Safety in MPI Programs • 15 minutes
- Lab: Practice Code • 10 minutes
- Lab: Practice Solution • 10 minutes
7 assignments • 63 minutes total
- Introduction to MPI • 9 minutes
- MPI Setup and Communicator Functions • 9 minutes
- SPMD and Communication • 9 minutes
- Potential Pitfalls • 9 minutes
- Simple Serial Sorting Algorithm • 9 minutes
- Parallel Odd-Even Transposition Sort • 9 minutes
- Safety in MPI Programs • 9 minutes
1 discussion prompt • 30 minutes total
- MPI in Action: Understanding Setup, Communication, and Parallel Sorting • 30 minutes
This module aims to introduce the shared memory programming model with the help of the OpenMP library. Students will gain exposure to the functions in the OpenMP library and methods to implement those in code to implement parallelism using shared memory. Students will explore the foundational concepts of OpenMP through videos and readings, starting with the basics of the library and progressing to more advanced topics such as reduction clauses, variable scoping, and mutual exclusion. Through worked examples like the Trapezoidal Rule and sorting functions, learners will understand how to parallelise loops, manage scheduling, and apply critical sections and locks for safe concurrent execution. The module also covers tasking in OpenMP and classic concurrency problems like producers and consumers.
What's included
12 videos • 12 readings • 13 assignments • 1 discussion prompt
12 videos • 94 minutes total
- Introduction to OpenMP • 5 minutes
- Programming in OpenMP • 10 minutes
- Trapezoidal Rule • 10 minutes
- Scope of Variables • 4 minutes
- Reduction Clause • 7 minutes
- Parallel-For Directive and Caveats in Them • 8 minutes
- Sorting Functions • 20 minutes
- Scheduling • 6 minutes
- Producers and Consumers • 6 minutes
- Termination, Startup and Atomic Directive • 7 minutes
- Critical Sections and Locks • 6 minutes
- Tasking • 5 minutes
12 readings • 152 minutes total
- Recommended Reading: Introduction to OpenMP • 15 minutes
- Recommended Reading: Programming in OpenMP • 15 minutes
- Recommended Reading: Trapezoidal Rule • 15 minutes
- Recommended Reading: Scope of Variables • 15 minutes
- Recommended Reading: Reduction Clause • 15 minutes
- Recommended Reading: Parallel-For Directive and Caveats in Them • 15 minutes
- Recommended Reading: Sorting Functions • 15 minutes
- Recommended Reading: Scheduling • 15 minutes
- Recommended Reading: Producers and Consumers • 15 minutes
- Recommended Reading: Termination, Startup and Atomic Directive • 1 minute
- Recommended Reading: Critical Sections and Locks • 1 minute
- Recommended Reading: Tasking • 15 minutes
13 assignments • 168 minutes total
- Introduction to OpenMP • 9 minutes
- Programming in OpenMP • 9 minutes
- Trapezoidal Rule • 9 minutes
- Scope of Variables • 9 minutes
- Reduction Clause • 9 minutes
- Parallel-For Directive and Caveats in Them • 9 minutes
- Sorting Functions • 9 minutes
- Scheduling • 9 minutes
- Producers and Consumers • 9 minutes
- Termination, Startup and Atomic Directive • 9 minutes
- Critical Sections and Locks • 9 minutes
- Tasking • 9 minutes
- Graded Quiz - Modules 7 and 8 • 60 minutes
1 discussion prompt • 30 minutes total
- Mastering OpenMP: From Parallel Patterns to Synchronisation • 30 minutes
This module will introduce the n-body problem in physics, examining its significance in simulating gravitational interactions among multiple particles. It will explore classical and modern algorithmic approaches to solving the n-body problem, followed by a discussion on their computational complexity. Emphasis will be placed on identifying opportunities for parallelisation, and students will analyse and implement efficient parallel solutions using the programming languages and parallel computing directives covered in the course.
What's included
13 videos • 13 readings • 13 assignments • 1 discussion prompt
13 videos • 107 minutes total
- Introduction to N-body Problem • 8 minutes
- Serial Solutions to the N-body Problem • 16 minutes
- Parallelising Strategy • 13 minutes
- Parallelising Basic Solver Using OpenMP • 9 minutes
- Parallelising Reduced Solver Using OpenMP • 11 minutes
- Evaluating OpenMP Performance • 5 minutes
- Parallelising Basic Solver Using Pthreads • 4 minutes
- Parallelising Basic Solver Using MPI • 9 minutes
- Parallelising Reduced Solver Using MPI • 9 minutes
- Evaluating MPI Performance • 6 minutes
- Parallelising Basic Solver Using CUDA • 7 minutes
- Evaluating CUDA Solver and Improving Performance • 4 minutes
- Using Shared Memory for Solvers • 7 minutes
13 readings • 195 minutes total
- Recommended Reading: Introduction to N-body Problem • 15 minutes
- Recommended Reading: Serial Solutions to the N-body Problem • 15 minutes
- Recommended Reading: Parallelising Strategy • 15 minutes
- Recommended Reading: Parallelising Basic Solver Using OpenMP • 15 minutes
- Recommended Reading: Parallelising Reduced Solver Using OpenMP • 15 minutes
- Recommended Reading: Evaluating OpenMP Performance • 15 minutes
- Recommended Reading: Parallelising Basic Solver Using Pthreads • 15 minutes
- Recommended Reading: Parallelising Basic Solver Using MPI • 15 minutes
- Recommended Reading: Parallelising Reduced Solver Using MPI • 15 minutes
- Recommended Reading: Evaluating MPI Performance • 15 minutes
- Recommended Reading: Parallelising Basic Solver Using CUDA • 15 minutes
- Recommended Reading: Evaluating CUDA Solver and Improving Performance • 15 minutes
- Recommended Reading: Using Shared Memory for Solvers • 15 minutes
13 assignments • 138 minutes total
- Introduction to N-body Problem • 9 minutes
- Serial Solutions to the N-body Problem • 9 minutes
- Parallelising Strategy • 9 minutes
- Parallelising Basic Solver Using OpenMP • 9 minutes
- Parallelising Reduced Solver Using OpenMP • 9 minutes
- Evaluating OpenMP Performance • 9 minutes
- Parallelising Basic Solver Using Pthreads • 9 minutes
- Parallelising Basic Solver Using MPI • 30 minutes
- Parallelising Reduced Solver Using MPI • 9 minutes
- Evaluating MPI Performance • 9 minutes
- Parallelising Basic Solver Using CUDA • 9 minutes
- Evaluating CUDA Solver and Improving Performance • 9 minutes
- Using Shared Memory for Solvers • 9 minutes
1 discussion prompt • 30 minutes total
- The N-Body Solver: Exploring Parallelism Across Models • 30 minutes
This module focuses on hands-on implementations of the Sample Sort algorithm using OpenMP, Pthreads, MPI, and CUDA. Students will explore the strengths and limitations of each parallel programming model through practical coding exercises. The module includes performance benchmarking and comparative analysis of the implementations to highlight trade-offs in scalability, efficiency, and suitability for different architectures. By the end of the module, students will have a strong grasp of each API and be equipped to make informed decisions about the most appropriate tool for a given parallel computing task.
What's included
8 videos • 9 readings • 10 assignments • 1 discussion prompt
8 videos • 61 minutes total
- Sample Sort and Bucket Sort • 10 minutes
- Map • 17 minutes
- Implementing Sample Sort Using OpenMP: First Implementation • 5 minutes
- Implementing Sample Sort Using OpenMP: Second Implementation • 7 minutes
- Implementing Sample Sort Using Pthreads • 4 minutes
- Implementing Sample Sort Using MPI • 6 minutes
- Implementing Sample Sort Using MPI: Example • 5 minutes
- Implementing Sample Sort Using CUDA • 7 minutes
9 readings • 115 minutes total
- Recommended Reading: Sample Sort and Bucket Sort • 15 minutes
- Recommended Reading: Map • 10 minutes
- Recommended Reading: Implementing Sample Sort Using OpenMP: First Implementation • 15 minutes
- Recommended Reading: Implementing Sample Sort Using OpenMP: Second Implementation • 15 minutes
- Recommended Reading: Implementing Sample Sort Using Pthreads • 10 minutes
- Recommended Reading: Implementing Sample Sort Using MPI • 15 minutes
- Recommended Reading: Implementing Sample Sort Using MPI: Example • 15 minutes
- Recommended Reading: Implementing Sample Sort Using CUDA • 10 minutes
- Recommended Reading: Which API? • 10 minutes
10 assignments • 432 minutes total
- Sample Sort and Bucket Sort • 9 minutes
- Map (Quiz) • 9 minutes
- Implementing Sample Sort Using OpenMP: First Implementation • 9 minutes
- Implementing Sample Sort Using OpenMP: Second Implementation • 9 minutes
- Implementing Sample Sort Using Pthreads • 9 minutes
- Implementing Sample Sort Using MPI • 9 minutes
- Implementing Sample Sort Using MPI: Example • 9 minutes
- Implementing Sample Sort Using CUDA • 9 minutes
- Graded Quiz - Modules 9 and 10 • 60 minutes
- SGA-2: Odd-Even Transposition Sort Parallelisation • 300 minutes
1 discussion prompt • 30 minutes total
- Parallel Sample Sort Across Platforms • 30 minutes
Final Comprehensive Examination
What's included
1 assignment
1 assignment • 30 minutes total
- Final Comprehensive Examination • 30 minutes
Instructors

Offered by

Birla Institute of Technology & Science, Pilani (BITS Pilani) is one of only ten private universities in India to be recognised as an Institute of Eminence by the Ministry of Human Resource Development, Government of India. It has been consistently ranked highly by both governmental and private ranking agencies for the innovative processes and capabilities that have enabled it to impart quality education and emerge as the best private science and engineering institute in India. BITS Pilani has four campuses, in Pilani, Goa, Hyderabad, and Dubai, and has been offering bachelor's, master's, and certificate programmes for over 58 years, helping to launch the careers of over 100,000 professionals.