GPU parallel program development using CUDA
| Main Author: | |
|---|---|
| Corporate Author: | |
| Format: | eBook |
| Language: | English |
| Published: | Boca Raton, FL : CRC Press, 2018. |
| Edition: | First edition. |
| Series: | Chapman & Hall/CRC computational science series. |
| Subjects: | |
| Online Access: | Connect to the full text of this electronic book |
Table of Contents:
- Cover; Half Title; Series; Title Page; Copyright Page; Contents; List of Figures; List of Tables; Preface; About the Author; Part I: Understanding CPU Parallelism; Chapter 1: Introduction to CPU Parallel Programming; 1.1 EVOLUTION OF PARALLEL PROGRAMMING; 1.2 MORE CORES, MORE PARALLELISM; 1.3 CORES VERSUS THREADS; 1.3.1 More Threads or More Cores to Parallelize?; 1.3.2 Influence of Core Resource Sharing; 1.3.3 Influence of Memory Resource Sharing; 1.4 OUR FIRST SERIAL PROGRAM; 1.4.1 Understanding Data Transfer Speeds; 1.4.2 The main() Function in imflip.c
- 1.4.3 Flipping Rows Vertically: FlipImageV(); 1.4.4 Flipping Columns Horizontally: FlipImageH(); 1.5 WRITING, COMPILING, RUNNING OUR PROGRAMS; 1.5.1 Choosing an Editor and a Compiler; 1.5.2 Developing in Windows 7, 8, and Windows 10 Platforms; 1.5.3 Developing in a Mac Platform; 1.5.4 Developing in a Unix Platform; 1.6 CRASH COURSE ON UNIX; 1.6.1 Unix Directory-Related Commands; 1.6.2 Unix File-Related Commands; 1.7 DEBUGGING YOUR PROGRAMS; 1.7.1 gdb; 1.7.2 Old School Debugging; 1.7.3 valgrind; 1.8 PERFORMANCE OF OUR FIRST SERIAL PROGRAM; 1.8.1 Can We Estimate the Execution Time?
- 1.8.2 What Does the OS Do When Our Code Is Executing?; 1.8.3 How Do We Parallelize It?; 1.8.4 Thinking About the Resources; Chapter 2: Developing Our First Parallel CPU Program; 2.1 OUR FIRST PARALLEL PROGRAM; 2.1.1 The main() Function in imflipP.c; 2.1.2 Timing the Execution; 2.1.3 Split Code Listing for main() in imflipP.c; 2.1.4 Thread Initialization; 2.1.5 Thread Creation; 2.1.6 Thread Launch/Execution; 2.1.7 Thread Termination (Join); 2.1.8 Thread Task and Data Splitting; 2.2 WORKING WITH BITMAP (BMP) FILES; 2.2.1 BMP is a Non-Lossy/Uncompressed File Format; 2.2.2 BMP Image File Format
- 2.2.3 Header File ImageStuff.h; 2.2.4 Image Manipulation Routines in ImageStuff.c; 2.3 TASK EXECUTION BY THREADS; 2.3.1 Launching a Thread; 2.3.2 Multithreaded Vertical Flip: MTFlipV(); 2.3.3 Comparing FlipImageV() and MTFlipV(); 2.3.4 Multithreaded Horizontal Flip: MTFlipH(); 2.4 TESTING/TIMING THE MULTITHREADED CODE; Chapter 3: Improving Our First Parallel CPU Program; 3.1 EFFECT OF THE PROGRAMMER ON PERFORMANCE; 3.2 EFFECT OF THE CPU ON PERFORMANCE; 3.2.1 In-Order versus Out-of-Order Cores; 3.2.2 Thin versus Thick Threads; 3.3 PERFORMANCE OF IMFLIPP
- 3.4 EFFECT OF THE OS ON PERFORMANCE; 3.4.1 Thread Creation; 3.4.2 Thread Launch and Execution; 3.4.3 Thread Status; 3.4.4 Mapping Software Threads to Hardware Threads; 3.4.5 Program Performance versus Launched Pthreads; 3.5 IMPROVING IMFLIPP; 3.5.1 Analyzing Memory Access Patterns in MTFlipH(); 3.5.2 Multithreaded Memory Access of MTFlipH(); 3.5.3 DRAM Access Rules of Thumb; 3.6 IMFLIPPM: OBEYING DRAM RULES OF THUMB; 3.6.1 Chaotic Memory Access Patterns of imflipP; 3.6.2 Improving Memory Access Patterns of imflipP; 3.6.3 MTFlipHM(): The Memory Friendly MTFlipH()