HPC

Drive faster breakthroughs through faster code: Get more results on your hardware today and carry your code forward to the future with code modernization.

Parallel Implementation and Cache Optimization of Blocked Matrix Multiplication

URL: https://gitee.com/mars0417/matrix_block_multiplication

Description:

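The general form of matrix multiplication is C = A*B. A serial implementation needs three nested loops and performs 2*M*N*K operations. To reduce this cost, the experiment partitions the result matrix C into equally sized work blocks; computing each work block needs only some of the rows of matrix A and some of the columns of matrix B. The GPU computes the work blocks in parallel, which gives a parallel implementation of blocked matrix multiplication and improves its efficiency. Partitioning alone does not change the number of memory reads; it only shortens the computation time through GPU parallelism. The project therefore adds cache blocks: within each work block, the computation is split into pieces of cache-block size, and each loop iteration caches just enough data from matrix A and matrix B to compute one cache-block-sized piece. This reduces the number of iterations and therefore the number of memory accesses, which is the cache optimization of blocked matrix multiplication.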
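The work-block and cache-block scheme described above can be sketched on the CPU in plain Python. This is a minimal illustration, not the project's GPU code: the function names, the `work` and `cache` parameters, and the list-of-lists matrix layout are assumptions made for the example. Here A is M x K and B is K x N.

```python
def matmul_naive(A, B):
    """Reference triple loop: 2*M*N*K arithmetic operations."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            for k in range(K):
                C[i][j] += A[i][k] * B[k][j]
    return C


def matmul_blocked(A, B, work=4, cache=2):
    """Compute C = A*B one work block of C at a time.

    `work` (edge of a work block of C) and `cache` (depth of a staged
    tile along K) are hypothetical tuning parameters.
    """
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    # Each (i0, j0) work block is independent of the others; on a GPU
    # these iterations would run in parallel.
    for i0 in range(0, M, work):
        for j0 in range(0, N, work):
            i1, j1 = min(i0 + work, M), min(j0 + work, N)
            # Walk along K in cache-sized steps.
            for k0 in range(0, K, cache):
                k1 = min(k0 + cache, K)
                # Stage the tiles of A and B needed for this step
                # (a stand-in for fast local memory).
                a_tile = [row[k0:k1] for row in A[i0:i1]]
                b_tile = [row[j0:j1] for row in B[k0:k1]]
                for i, a_row in enumerate(a_tile):
                    c_row = C[i0 + i]
                    for kk, a_val in enumerate(a_row):
                        for j, b_val in enumerate(b_tile[kk]):
                            c_row[j0 + j] += a_val * b_val
    return C
```

Both functions accumulate over k in the same order, so they return identical results. In pure Python the blocked version only shows the loop structure; the memory-traffic benefit appears when the staged tiles live in fast local memory on the device.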

Posted:

Research on the SpMV Algorithm on the oneAPI Platform

URL: https://github.com/jason-designer/SpMV-using-oneAPI

Description:

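Sparse matrix-dense vector multiplication (SpMV) is an important scientific computing operation and the performance bottleneck of many applications, so speeding it up matters. Most SpMV research so far targets GPU or multicore platforms, and there is comparatively little work on Intel's oneAPI high-performance computing platform, which has matured in recent years. The oneAPI platform does provide an SpMV routine in its libraries, but its performance is weak. This project therefore implements the LightSpMV algorithm on the oneAPI platform and tunes it for that platform; the tuned algorithm improves performance by a factor of 1.97 over the SpMV routine provided by oneAPI. The algorithm code has been cleaned up and is ready to use out of the box. The LightSpMV tuning code has also been organized, so it can be reused if Intel's hardware changes later.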
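For readers unfamiliar with SpMV, here is a minimal serial sketch over the compressed sparse row (CSR) format, which LightSpMV also operates on. It is not the LightSpMV algorithm or the project's SYCL code; the function names and the dense-to-CSR helper are assumptions made for the example.

```python
def dense_to_csr(matrix):
    """Convert a dense list-of-lists matrix to CSR arrays (helper for the example)."""
    row_ptr, col_idx, values = [0], [], []
    for row in matrix:
        for col, val in enumerate(row):
            if val != 0:
                col_idx.append(col)
                values.append(float(val))
        # row_ptr[r + 1] marks where row r's nonzeros end.
        row_ptr.append(len(values))
    return row_ptr, col_idx, values


def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A*x where A is stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    # Rows are independent, so a parallel version distributes rows
    # (or groups of rows) across threads or work-items.
    for row in range(n_rows):
        acc = 0.0
        for p in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[p] * x[col_idx[p]]
        y[row] = acc
    return y
```

The irregular number of nonzeros per row is what makes load balancing the central tuning problem for parallel SpMV.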

Posted:

OneAPI_homework

URL: https://github.com/Xiaozaichen/OneAPI_homework

Description:

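Uses the Intel oneAPI AI Analytics Toolkit to accelerate support vector machine (SVM) classification on the connect-4 dataset. The dataset records a variety of board positions, each labeled as a win, loss, or draw; an SVM is trained on it and then used for prediction. Uses the Intel oneAPI HPC Toolkit to accelerate the matrix computation in Gaussian elimination.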
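As a reference point for the Gaussian elimination part, a minimal serial sketch follows. It is not the project's code: the function name is hypothetical, and the use of partial pivoting is an assumption, since the description does not say which variant the project accelerates.

```python
def gaussian_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (assumed variant)."""
    n = len(A)
    # Build the augmented matrix [A | b] as floats.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Choose the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # The updates of the rows below the pivot are independent of one
        # another, which is the part a parallel version can distribute.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x
```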

Posted:

Beautify Me Medical Center Dubai

URL: https://beautifymemedicalcenter.com/

Description:

The goal of Beautify Me Medical Center is to become known as one of the best cosmetic clinics in the Emirates. We are really grateful to have a talented team of medical professionals with a variety of specialties in each area. We provide you the cosmetic therapy you require to boost your beauty and

Posted:

Parallel Quickselect TopK Algorithm Implementation Based on OneAPI

URL: https://github.com/MichaelTenma/TopK/tree/main

Description:

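TopK means finding the K smallest (or largest) numbers in a sequence of numbers. This project uses OneAPI to run the TopK computation in parallel on multiple CPU cores. It finds the K smallest values with a parallel quickselect algorithm. The key step is to split the sequence into L blocks of size B and run quickselect on each block to obtain that block's K smallest values. Quickselect is then run again on the K smallest values from all L blocks, K*L values in total, to find the final K smallest values. Different blocks can be processed in parallel on different CPU cores, which improves performance.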
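The two-stage scheme above can be sketched as follows. This is an illustration under stated assumptions, not the project's OneAPI code: the function names, the `block_size` and `workers` parameters, and the thread pool are choices made for the example. CPython threads show only the structure of the parallel stage and will not speed up this pure-Python work.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def quickselect_smallest(values, k):
    """Return the k smallest items of `values`, in no particular order."""
    items = list(values)
    if k <= 0:
        return []
    if k >= len(items):
        return items
    lo, hi, target = 0, len(items) - 1, k - 1
    while lo < hi:
        pivot = items[random.randint(lo, hi)]
        i, j = lo, hi
        # Hoare-style partition around the pivot value.
        while i <= j:
            while items[i] < pivot:
                i += 1
            while items[j] > pivot:
                j -= 1
            if i <= j:
                items[i], items[j] = items[j], items[i]
                i += 1
                j -= 1
        # Keep only the side that contains index `target`.
        if target <= j:
            hi = j
        elif target >= i:
            lo = i
        else:
            break
    return items[:k]


def top_k_smallest(data, k, block_size, workers=4):
    """Stage 1: quickselect per block. Stage 2: quickselect over the candidates."""
    blocks = [data[s:s + block_size] for s in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_block = list(pool.map(lambda blk: quickselect_smallest(blk, k), blocks))
    candidates = [v for part in per_block for v in part]  # at most K*L values
    # Sorted only to make the result easy to compare; the selection itself is unordered.
    return sorted(quickselect_smallest(candidates, k))
```

The scheme is correct because each of the overall K smallest values is necessarily among the K smallest of its own block, so none of them is dropped in stage 1.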

Posted:

Parallel Python, C/C++ and FORTRAN applied to physics simulations

URL: https://github.com/juanjoseleongil/parallelProject

Description:

This project aims to develop highly optimized code for physics simulations using Python, C/C++ and FORTRAN, with runtime comparisons between serial and parallel code, as well as compilation differences between architectures.

Posted:

Exact State Reconstruction of Linear Iterative Solvers with NVRAM

URL: https://github.com/Scientific-Computing-Lab-NRCN/In-NVRAM-ESR.git

Description:

This project includes the implementation of In-NVRAM Exact State Reconstruction (ESR) for the PCG solver, as we describe in https://arxiv.org/pdf/2204.11584.pdf. In this work we also plan to research recoverability (with NVRAM) of concurrent applications using OpenMP.

Posted:

swsharp_sycl

URL: https://github.com/ManuelCostanzo/swsharp_oneapi

Description:

This project is based on SW# (https://github.com/mkorpar/swsharp), CUDA software for biological sequence alignment. We migrated the CUDA code to SYCL using the oneAPI dpct tool and applied our own modifications to the parts that the tool could not migrate.

Posted: