Solving the 2D Laplace Equation Using DPC++
We solve Laplace’s equation using DPC++ to find the electric potential U(x) in a two-dimensional region whose boundaries are held at fixed voltages.
Project status: Published/In Market
Intel Technologies
DevCloud, oneAPI, DPC++, Intel Arc, Intel Iris Xe MAX
Overview / Usage
Condition
This code uses a finite difference scheme to solve Laplace's equation for a square matrix distributed over a square (logical) processor topology. A complete description of the algorithm is found in the reference by Fox.
This code follows the SPMD (single program, multiple data) paradigm. It illustrates 2D block decomposition, nodes exchanging edge values, and convergence checking.
Each matrix element is updated based on the values of its four neighboring matrix elements. This process is repeated until the data converges, that is, until the average change in the matrix elements (compared to their values 20 iterations earlier) is smaller than a specified value.
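The update and the convergence test described above can be sketched in plain, serial C++. This is a minimal illustration, not the project's code: the names Grid, relax_once, mean_abs_change and solve are hypothetical, the sweep writes into a copy of the grid (a Jacobi-style simplification chosen to keep the example short), and the tolerance and iteration cap are assumed parameters.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// One sweep: replace every interior element with the mean of its four
// neighbours. Boundary rows and columns hold the fixed voltages.
Grid relax_once(const Grid& u) {
    Grid next = u;
    for (std::size_t i = 1; i + 1 < u.size(); ++i)
        for (std::size_t j = 1; j + 1 < u[i].size(); ++j)
            next[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j] +
                                 u[i][j + 1] + u[i][j - 1]);
    return next;
}

// Mean absolute difference between two grids of the same shape.
double mean_abs_change(const Grid& a, const Grid& b) {
    double sum = 0.0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < a[i].size(); ++j) {
            sum += std::fabs(a[i][j] - b[i][j]);
            ++count;
        }
    return count ? sum / count : 0.0;
}

// Sweep until the grid has changed by less than `tol` on average since the
// snapshot taken `check_every` sweeps earlier, or until `max_iter` sweeps
// have run. Returns the number of sweeps performed.
int solve(Grid& u, double tol, int check_every = 20, int max_iter = 100000) {
    Grid snapshot = u;
    int iter = 0;
    while (iter < max_iter) {
        u = relax_once(u);
        ++iter;
        if (iter % check_every == 0) {
            if (mean_abs_change(u, snapshot) < tol) break;
            snapshot = u;  // compare the next 20 sweeps against this state
        }
    }
    return iter;
}
```

Checking only every 20 sweeps keeps the cost of the comparison small relative to the sweeps themselves.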
To ensure reproducible results between runs, a red/black checkerboard algorithm is used. Each process exchanges edge values with its four neighbors. Then new values are calculated for the upper left and lower right corners (the "red" corners) of each node's matrix. The processes exchange edge values again. The upper right and lower left corners (the "black" corners) are then calculated.
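A serial sketch of the red/black ordering may make the idea concrete. It assumes that a point's colour is the parity of i + j (so the upper left and lower right points of every 2x2 cell are "red"); the function names are hypothetical, and the edge exchange between processes is left out.

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Update only the interior points whose (i + j) parity equals `color`
// (0 = "red", 1 = "black"). Every neighbour of a red point is black and
// vice versa, so updates within one colour never read each other.
void sweep_color(Grid& u, int color) {
    for (std::size_t i = 1; i + 1 < u.size(); ++i)
        for (std::size_t j = 1; j + 1 < u[i].size(); ++j)
            if (static_cast<int>((i + j) % 2) == color)
                u[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j] +
                                  u[i][j + 1] + u[i][j - 1]);
}

// One full iteration: red half-sweep, then black half-sweep.
void red_black_iteration(Grid& u) {
    sweep_color(u, 0);
    sweep_color(u, 1);
}
```

Because points of one colour depend only on points of the other colour, the result of a half-sweep does not depend on the order in which its points are visited, which is what makes the scheme reproducible and safe to run in parallel.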
Methodology / Approach
Use Case for DPC++
The code is currently configured for a 48x48 matrix distributed over four processors. It can be edited to handle different matrix sizes or number of processors, as long as the matrix can be divided evenly between the processors.
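The even-division requirement can be illustrated with a small helper that maps a process rank to its block of the matrix. This is a sketch under assumptions: the Block struct and block_for_rank are hypothetical names, and ranks are assumed to be numbered row by row on a p x p logical grid.

```cpp
#include <stdexcept>

struct Block {
    int row_begin, row_end;  // half-open row range [row_begin, row_end)
    int col_begin, col_end;  // half-open column range [col_begin, col_end)
};

// Return the sub-block of an n x n matrix owned by `rank` on a p x p grid
// of processes. Throws if the matrix cannot be divided evenly.
Block block_for_rank(int n, int p, int rank) {
    if (p <= 0 || n % p != 0)
        throw std::invalid_argument("matrix size must divide evenly");
    if (rank < 0 || rank >= p * p)
        throw std::out_of_range("rank outside the process grid");
    const int b = n / p;      // block edge length
    const int pr = rank / p;  // process row
    const int pc = rank % p;  // process column
    return {pr * b, (pr + 1) * b, pc * b, (pc + 1) * b};
}
```

For the 48x48 matrix on four processors (a 2 x 2 grid), block_for_rank(48, 2, 3) covers rows 24 to 47 and columns 24 to 47, so each process owns a 24x24 block.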
Variations for Laplace’s Equation
We discretize the equation and solve it iteratively with the Gauss–Seidel method.
Gauss–Seidel or Successive Displacement Method
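The Gauss–Seidel method, also known as the successive displacement method, is an improved form of the Jacobi method: it uses each newly computed value as soon as it is available instead of waiting for the next iteration.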
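Written out for the four-neighbor update, the difference between the two methods is only in which iterate appears on the right-hand side. The notation is an assumption for illustration: U_{i,j}^{(k)} is the potential at grid point (i, j) after k iterations, and the sweep is assumed to visit points in increasing i, then increasing j.

```latex
% Jacobi: every neighbour comes from the previous iteration k
U_{i,j}^{(k+1)} = \tfrac{1}{4}\left( U_{i+1,j}^{(k)} + U_{i-1,j}^{(k)}
                                   + U_{i,j+1}^{(k)} + U_{i,j-1}^{(k)} \right)

% Gauss-Seidel: neighbours already visited in this sweep use iteration k+1
U_{i,j}^{(k+1)} = \tfrac{1}{4}\left( U_{i+1,j}^{(k)} + U_{i-1,j}^{(k+1)}
                                   + U_{i,j+1}^{(k)} + U_{i,j-1}^{(k+1)} \right)
```

Gauss–Seidel therefore needs only one copy of the grid, since each point is overwritten in place.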
**DPC++ Implementation**
Including the headers
#include <CL/sycl.hpp>
#include "dpc_common.hpp"
using namespace sycl;
Device and Queue
The code uses the default selector, which means that the runtime selects the target device to run the kernel on. This typically results in use of a GPU if one is present; otherwise the CPU is used. To force a certain device, use cpu_selector or gpu_selector instead of default_selector. A queue is then created from the selector and an exception handler, and this code is wrapped in a try-catch block.
The kernel code and its execution follow.
We will also target FPGAs.
Implementing parallel for
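// Red/black Gauss-Seidel sweeps. Assumes a sycl::queue q and a
// sycl::buffer<float, 2> potl_buf of extent size x size declared earlier.
for (int iter = 0; iter < num_iter; iter++) // iterations
{
    for (int color = 0; color < 2; color++) // red points, then black points
    {
        q.submit([&](handler &h) {
            accessor potl(potl_buf, h, read_write);
            h.parallel_for(range<2>(size - 2, size - 2), [=](id<2> idx) {
                size_t i = idx[0] + 1; // x-direction, interior points only
                size_t j = idx[1] + 1; // y-direction, interior points only
                if (static_cast<int>((i + j) % 2) != color) return; // other colour
                potl[i][j] = 0.25f * ( potl[i+1][j] + potl[i-1][j]
                                     + potl[i][j+1] + potl[i][j-1] );
            });
        });
    }
}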
Technologies Used
Intel oneAPI
Intel DevCloud