Commit 24cdc7b

Gordon authored and committed

Add exercises as markdown files in repository.

1 parent d279736 commit 24cdc7b

4 files changed: +321 -0 lines changed
# Parallelism in Modern C++: from CPU to GPU

### Exercise 0: Setting up ComputeCpp

---

In this exercise you will learn:
* How to install ComputeCpp and its dependencies.
* How to create and compile a SYCL application using ComputeCpp.

---

1.) Install ComputeCpp

ComputeCpp is an implementation of SYCL 1.2.1 developed by Codeplay. It is the first complete and conformant implementation of the standard, and a free community edition is available to download.

You can download the ComputeCpp CE package from the [developer portal][computecpp-developer-portal] (you will need to create an account). ComputeCpp CE is available on Windows 7/10, Ubuntu 14.04/16.04 and CentOS.

If you do not have a supported operating system you can download and run the Ubuntu 16.04 Docker image. To do this, first install [Docker][docker-installer]; then, once your Docker environment is configured, pull and run the ComputeCpp Ubuntu 16.04 Docker image:

```
docker pull aerialmantis/computecpp_ubuntu1604
docker run --rm -it aerialmantis/computecpp_ubuntu1604
```

2.) Install dependencies

ComputeCpp supports a wide range of OpenCL devices; however, in addition to installing ComputeCpp you must also install the relevant OpenCL drivers for the device you wish to run on.

If you are using the Docker image, the Intel OpenCL drivers will already be installed, so you can skip this step.

Please follow the links below to find the appropriate OpenCL drivers for the device you wish to use:
* [Intel OpenCL drivers][intel-drivers]
* [AMD OpenCL drivers][amd-drivers]
* [Nvidia OpenCL drivers][nvidia-drivers]

Please note that:
* Not all AMD devices are supported by ComputeCpp, as some more recent AMD drivers do not support SPIR.
* Nvidia GPU support is still experimental and is not yet available on Windows.

For the purposes of this class we recommend that you use a GPU, but all exercises can be done using a CPU device or even the host device (an emulated OpenCL device available with ComputeCpp that can be used when OpenCL is not available). Note that performance will vary between the CPU and the GPU.
3.) Check your installation

The ComputeCpp CE package comes with a tool called computecpp_info, which lists all of the devices on your machine that can be used by ComputeCpp. Use it to check that you have the appropriate drivers installed for your chosen device; it will also tell you whether that particular device has been tested with ComputeCpp CE.

When you run computecpp_info you will see something like:

```
********************************************************************************

ComputeCpp Info (CE 1.1.4)

SYCL 1.2.1 revision 3

********************************************************************************

Device Info:

Discovered 2 devices matching:
  platform    : <any>
  device type : <any>

--------------------------------------------------------------------------------
Device 0:

  Device is supported  : NO - Device does not support SPIR
  CL_DEVICE_NAME       : GeForce GTX 1060 6GB
  CL_DEVICE_VENDOR     : NVIDIA Corporation
  CL_DRIVER_VERSION    : 417.35
  CL_DEVICE_TYPE       : CL_DEVICE_TYPE_GPU
--------------------------------------------------------------------------------
Device 1:

  Device is supported  : YES - Tested internally by Codeplay Software Ltd.
  CL_DEVICE_NAME       : Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
  CL_DEVICE_VENDOR     : Intel(R) Corporation
  CL_DRIVER_VERSION    : 6.4.0.37
  CL_DEVICE_TYPE       : CL_DEVICE_TYPE_CPU

If you encounter problems when using any of these OpenCL devices, please consult
this website for known issues:
https://computecpp.codeplay.com/releases/v1.1.4/platform-support-notes

********************************************************************************
```

You can also pass the --verbose option to display further information about each device.

From this output you should confirm that you have a supported device and that the CE version is 1.1.4.
4.) Compile a SYCL application

Once you have ComputeCpp and the OpenCL drivers for your chosen device installed, you are ready to compile your first SYCL application.

The best way to do this is to download this repository and build the first example.

Clone this repository.

```
git clone https://github.com/AerialMantis/cppcon-parallelism-class.git
```

Create a build directory.

```
cd cppcon-parallelism-class
mkdir build
cd build
```

Run CMake to configure the solutions (local setup):

```
cmake ../ -G[Generator] -DCMAKE_BUILD_TYPE=Debug
```

Run CMake to configure the solutions (Docker setup):

```
cmake ../ -GNinja -DCMAKE_BUILD_TYPE=Debug -DOpenCL_LIBRARY=${OCL_LIB}/libOpenCL.so -DOpenCL_INCLUDE_DIR=${OCL_INC}
```

Note that if you are using an Nvidia GPU, in order to use the experimental ComputeCpp SYCL support you must add the following to the above CMake command:

```
-DCOMPUTECPP_BITCODE=ptx64
```

If you are unable to use ComputeCpp, you can disable the SYCL tests by adding the following to the above CMake command:

```
-DCPPCON_SYCL_EXAMPLES=OFF
```

Build your solution:

```
cmake --build .
```

Verify your setup by running the hello world example:

```
./examples/hello_world
```

[computecpp-developer-portal]: https://developer.codeplay.com/home/
[docker-installer]: https://docs.docker.com/install/
[intel-drivers]: https://software.intel.com/en-us/articles/opencl-drivers
[amd-drivers]: https://www.amd.com/en/support
[nvidia-drivers]: https://developer.nvidia.com/cuda-toolkit-32-downloads
# Parallelism in Modern C++: from CPU to GPU

### Exercise 1: Configuring a Queue

---

In this exercise you will learn:
* How to construct a queue using a device selector.
* How to define and use your own device selector.

---

The first thing you must do in a SYCL application is construct a queue, which is used to enqueue work; each queue is associated with a single device to which it submits that work. The simplest way to construct a queue in SYCL is to pass it a device selector, which is then used to choose a device from all the devices available in your system.

1.) Create a queue using the default selector

Construct a queue using the default selector as follows:

```
auto myQueue = cl::sycl::queue{cl::sycl::default_selector{}};
```

2.) Print the name of the device that is chosen

Retrieve the chosen device from your queue and query its name as follows:

```
auto myDevice = myQueue.get_device();
auto deviceName = myDevice.get_info<cl::sycl::info::device::name>();
std::cout << "Chosen device: " << deviceName << std::endl;
```

3.) Try other device selectors

Replace the default selector with one of the other standard device selectors provided by SYCL (see [SYCL 1.2.1 specification][sycl-specification], sec. 4.6.1.2).

4.) Create your own device selector

Create a device selector using the template below, implementing the function call operator using device info queries like the one we used earlier to query the device name (see SYCL 1.2.1 specification, sec. 4.6.4.2), and then use that device selector in the queue constructor:

```
class my_device_selector : public cl::sycl::device_selector {
 public:
  my_device_selector() {}

  virtual int operator()(const cl::sycl::device &device) const {
    return 0;  // TODO: score the device using its info queries
  }
};
```

Remember that the value returned reflects the score for each device; a device with a negative score will never be chosen.
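To make the scoring idea concrete, the logic that would live inside such an `operator()` can be prototyped in plain standard C++, independent of any SYCL runtime. This is a hypothetical sketch: the function `score_device` and its parameters are our own stand-ins for the real device info queries (e.g. `device.is_gpu()` and `cl::sycl::info::device::vendor`):

```cpp
#include <string>

// Hypothetical stand-in for a device_selector's operator():
// score a device from info-query results, passed here as plain parameters.
// A negative score would mean "never choose this device".
int score_device(bool is_gpu, const std::string &vendor) {
  // Prefer GPUs over other device types, as recommended for this class.
  int score = is_gpu ? 100 : 50;
  // Example vendor preference, mirroring an info::device::vendor query:
  // Nvidia support is experimental, so rank those devices lower.
  if (vendor.find("NVIDIA") != std::string::npos) {
    score -= 25;
  }
  return score;
}
```

Inside a real `my_device_selector::operator()` the same decisions would be driven by calls such as `device.is_gpu()` and `device.get_info<cl::sycl::info::device::vendor>()`.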
[sycl-specification]: https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf

docs/sycl_02_hello_world.md
# Parallelism in Modern C++: from CPU to GPU

### Exercise 2: Hello World

---

In this exercise you will learn:
* How to create and submit a command group
* How to define a SYCL kernel function
* How to stream output from a SYCL kernel function

---

Once you have a queue you can submit work for the device to execute. This is done via command groups, which are made up of commands and data dependencies.

1.) Define a command group

Define a lambda to represent your command group and pass it to the submit member function of the queue as follows:

```
myQueue.submit([&](cl::sycl::handler &cgh){

});
```

Note that submitting a command group without any commands will result in an error.

2.) Define a SYCL kernel function

Define a SYCL kernel function via the single_task command within the command group as follows:

```
cgh.single_task<hello_world>([=](){

});
```

Remember to declare a class for your kernel name in the global namespace.

3.) Stream "Hello world!" to stdout from the SYCL kernel function

Construct a stream within the scope of the command group, passing it the command group handler, as follows:

```
auto os = cl::sycl::stream{128, 128, cgh};
```

Then use the stream you constructed within the SYCL kernel function to print "Hello world!" as follows:

```
os << "Hello world!" << cl::sycl::endl;
```

4.) Try another command

Instead of single_task, try another command for defining a SYCL kernel function (see [SYCL 1.2.1 specification][sycl-specification], sec. 4.8.5).
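Putting steps 1 to 3 together, a complete hello-world program looks roughly like the sketch below. It assumes the ComputeCpp setup from Exercise 0 and needs a SYCL implementation to compile; the explicit `wait()` at the end is our addition, there to keep the program alive until the kernel has finished printing:

```
#include <CL/sycl.hpp>

// Kernel name, declared in the global namespace.
class hello_world;

int main() {
  auto myQueue = cl::sycl::queue{cl::sycl::default_selector{}};

  myQueue.submit([&](cl::sycl::handler &cgh) {
    // The stream must be constructed with the command group handler.
    auto os = cl::sycl::stream{128, 128, cgh};

    cgh.single_task<hello_world>([=]() {
      os << "Hello world!" << cl::sycl::endl;
    });
  });

  // Block until all submitted work has completed.
  myQueue.wait();
}
```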
[sycl-specification]: https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf

docs/sycl_03_vector_add.md
# Parallelism in Modern C++: from CPU to GPU

### Exercise 3: Vector add

---

In this exercise you will learn:
* How to manage data using buffers
* How to access data using accessors
* How to define an nd-range SYCL kernel function

---

In SYCL, buffers are used to manage data across the host and device(s), and accessors are used both to declare the data dependencies of a SYCL kernel function and to access the data within it.

1.) Allocate your input and output vectors

Allocate memory on the host for your two input vectors and your output vector using std::vector, and initialise the inputs with values:

```
auto inputA = std::vector<float>{};
auto inputB = std::vector<float>{};
auto output = std::vector<float>{};

inputA.resize(size);
inputB.resize(size);
output.resize(size);

std::iota(begin(inputA), end(inputA), 0.0f);
std::iota(begin(inputB), end(inputB), 0.0f);
std::fill(begin(output), end(output), 0.0f);
```

2.) Construct buffers

Construct a buffer to manage each of your input and output vectors:

```
auto inputABuf = cl::sycl::buffer<float, 1>(inputA.data(),
  cl::sycl::range<1>(inputA.size()));
auto inputBBuf = cl::sycl::buffer<float, 1>(inputB.data(),
  cl::sycl::range<1>(inputB.size()));
auto outputBuf = cl::sycl::buffer<float, 1>(output.data(),
  cl::sycl::range<1>(output.size()));
```

3.) Construct accessors

Within your command group, construct an accessor for each of your buffers:

```
auto inputAAcc = inputABuf.get_access<cl::sycl::access::mode::read>(cgh);
auto inputBAcc = inputBBuf.get_access<cl::sycl::access::mode::read>(cgh);
auto outputAcc = outputBuf.get_access<cl::sycl::access::mode::write>(cgh);
```

4.) Declare your kernel

Declare a SYCL kernel function using the parallel_for command, which takes a range specifying the number of work-items to launch and a function object that is invoked with the id of each work-item:

```
cgh.parallel_for<vector_add>(cl::sycl::range<1>(inputA.size()),
  [=](cl::sycl::id<1> id) {
    outputAcc[id] = inputAAcc[id] + inputBAcc[id];
});
```
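When checking your results, it helps to have a plain host-only C++ reference implementation of the same computation to compare against the values the kernel produced. This is a sketch using our own helper name `vector_add_ref`; it is not part of the exercise's SYCL code:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Host-only reference: element-wise sum of two equally sized vectors.
std::vector<float> vector_add_ref(const std::vector<float> &inputA,
                                  const std::vector<float> &inputB) {
  assert(inputA.size() == inputB.size());
  auto output = std::vector<float>(inputA.size(), 0.0f);
  // The same per-element operation the SYCL kernel performs for each id.
  std::transform(begin(inputA), end(inputA), begin(inputB), begin(output),
                 [](float a, float b) { return a + b; });
  return output;
}
```

After the buffers are destroyed (so the results have been copied back to the host), you can compare `output` against `vector_add_ref(inputA, inputB)` element by element.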
