XJoin

A portable, parallel hash join implementation across diverse XPU architectures, built with oneAPI.

Project status: Under Development

oneAPI, HPC

Intel Technologies
oneAPI, DPC++, Intel Iris Xe MAX, Intel Integrated Graphics, DevCloud

Overview / Usage

Modern server hardware is increasingly heterogeneous, with a diverse mix of XPU architectures deployed across CPUs, GPUs, and FPGAs. To date, however, database developers have had to rely on either proprietary, architecture-specific solutions (such as CUDA) or low-level, cross-architecture solutions that complicate development (such as OpenCL). This lack of portable parallelism, caused by the absence of a common high-level programming framework, is one of the main obstacles to wider adoption of XPUs by database systems. In this project, we take the first steps towards solving this problem using oneAPI, a cross-industry effort to develop an open, standards-based, unified programming model that extends standard C++ to provide portable parallelism across diverse processor architectures.

Methodology / Approach

We port a recently proposed, highly optimized, GPU-based hash join algorithm from CUDA to Data Parallel C++ (DPC++). We then execute the hash join on multicore CPUs, integrated GPUs (Intel Gen9), and discrete GPUs (Intel DG1 and NVIDIA GeForce) without changing a single line of kernel code, showing that DPC++ enables portable parallelism. We compare the performance of the DPC++ kernels against hand-optimized CUDA kernels and against model-based theoretical performance bounds to quantify the performance–portability trade-off of using DPC++.
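For readers unfamiliar with the operation being ported, the following is a minimal, sequential sketch of a build/probe hash join in standard C++. It is not the project's kernel code: the names (Slot, kEmpty, hash_key, build, probe), the linear-probing table layout, and the multiplicative hash are all assumptions chosen for illustration. Each loop body touches one input tuple independently of the others, which is the shape of work a data-parallel kernel would assign to one work-item (a parallel build would additionally need atomic slot claims, omitted here).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative sketch only; every name here is hypothetical and none is taken from XJoin.
constexpr std::uint32_t kEmpty = 0xFFFFFFFFu;  // key value reserved to mark an empty slot

struct Slot {
    std::uint32_t key;
    std::uint32_t payload;
};

// Multiplicative hash reduced to a power-of-two table size via a bit mask.
inline std::size_t hash_key(std::uint32_t key, std::size_t mask) {
    return static_cast<std::size_t>(
        (static_cast<std::uint64_t>(key) * 2654435761u) & mask);
}

// Build phase: insert every build-side tuple into an open-addressing table.
std::vector<Slot> build(const std::vector<std::uint32_t>& keys,
                        const std::vector<std::uint32_t>& payloads) {
    assert(keys.size() == payloads.size());
    std::size_t capacity = 1;
    while (capacity < 2 * keys.size() + 1) capacity <<= 1;  // keeps load factor below 0.5
    std::vector<Slot> table(capacity, Slot{kEmpty, 0});
    const std::size_t mask = capacity - 1;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        assert(keys[i] != kEmpty);  // the sentinel cannot be used as a real key
        std::size_t s = hash_key(keys[i], mask);
        while (table[s].key != kEmpty) s = (s + 1) & mask;  // linear probing
        table[s].key = keys[i];
        table[s].payload = payloads[i];
    }
    return table;
}

// Probe phase: for each probe-side tuple, walk the probe sequence until an
// empty slot and emit (build payload, probe payload) for every key match.
std::vector<std::pair<std::uint32_t, std::uint32_t>> probe(
        const std::vector<Slot>& table,
        const std::vector<std::uint32_t>& keys,
        const std::vector<std::uint32_t>& payloads) {
    assert(keys.size() == payloads.size());
    std::vector<std::pair<std::uint32_t, std::uint32_t>> out;
    const std::size_t mask = table.size() - 1;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        for (std::size_t s = hash_key(keys[i], mask);
             table[s].key != kEmpty; s = (s + 1) & mask) {
            if (table[s].key == keys[i]) {
                out.push_back(std::make_pair(table[s].payload, payloads[i]));
            }
        }
    }
    return out;
}
```

Calling build on the smaller relation and probe on the larger one yields the equi-join result as payload pairs. Duplicate build keys are supported because the probe loop continues past a match until it reaches an empty slot.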

Repository

https://github.com/Eug9/XJoin
