See discussions, stats, and author profiles for this publication at:
https://www.researchgate.net/publication/261949566
Parallel Programming using OpenCL on Modern Architectures
Book · January 2012
3 authors, including:
Allan Svejstrup Nielsen, École Polytechnique Fédérale de Lausanne
Allan Peter Engsig-Karup, Technical University of Denmark
Uploaded by Allan Peter Engsig-Karup on 24 September 2015.
IMM Technical Report 2012-05
Parallel Programming using OpenCL on Modern Architectures
Allan Svejstrup Nielsen
Allan P. Engsig-Karup & Bernd Dammann
Abstract
This report is intended as a quick introduction to the OpenCL framework. Its aim is to
facilitate a smooth transition to OpenCL C for developers with previous GPGPU
experience. The purpose of OpenCL is to allow developers to use all compute resources
available on a heterogeneous hardware platform.
In addition to introducing OpenCL, the report presents an overview of AMD
GPU hardware, covering both the VLIW5/4 architectures and the upcoming Graphics Core
Next architecture, which is to form the basis of AMD's future generations of GPUs, intended
to be as capable at compute as they are at graphics.
To conclude the presentation of OpenCL as a language for compute, a matrix-matrix
multiplication example is devised and optimized for the VLIW4, Tesla and Fermi architectures.
Performance is measured as a function of both matrix size and work-group size, and the
results are discussed. Where applicable, the equivalent CUDA implementation is tested for
comparison.
GPUlab
DTU Informatics
16th March 2012
Preface
The report at hand was written as part of a 5 ECTS special topic course at DTU Department
of Informatics and Mathematical Modeling (DTU Informatics) with Allan P. Engsig-Karup
and Bernd Dammann as advisers.
Part of the CUDA code tested and presented within the report was developed during another
5 ECTS course at DTU, 02614 High Performance Computing.
The report represents a brief condensation of many of the current developments within OpenCL
and heterogeneous computing.
Contents
I Heterogeneous Computing and OpenCL
1 Scientific Computing Demands Throughput
    1.1 A New Market for GPUs
    1.2 The Open Compute Language
2 Trends and Industry Movement
    2.1 Programming Paradigms
    2.2 Current Development in GPU Hardware
    2.3 OpenCL Gaining Ground
    2.4 Compute Accelerators
3 Outlook
II The OpenCL Framework
4 An Introduction
    4.1 Host and Devices
    4.2 Conceptual Foundations
        4.2.1 The Platform Model
        4.2.2 The Execution Model
        4.2.3 The Memory Model
    4.3 Contexts
    4.4 Command-Queues
    4.5 Programs and Kernels
5 Computing with OpenCL
    5.1 A General Approach
    5.2 Code Samples
6 The Kernel Programming Language
    6.1 OpenCL C
    6.2 Types and Qualifiers
    6.3 Built-In Functions
    6.4 Kernel Compilation
7 Summary