Drive faster breakthroughs through faster code: Get more results on your hardware today and carry your code forward to the future with code modernization.
In the age of AI, algorithms must efficiently cope with vast data sets. We propose a performance-portable implementation of Locality-Sensitive Hashing (LSH), an approximate k-nearest neighbors algorithm, using different SYCL implementations—ComputeCpp, hipSYCL, DPC++—supporting multiple GPUs.
In this project, we aim to determine whether FPGAs are a viable option as hardware accelerators and, if so, how their performance compares to existing processors such as GPGPUs across various types of HPC workloads. We have an opportunity to take advantage of the recent developments in High-Level
The MediaMapper project plans to produce a map of topics on social media based on keywords, hashtags, titles, labels, captions, and search words, as well as images and videos. The name MediaMapper signifies the mapping of topics and trends across social media all over the Internet.
This project is adapted from the integral project in the Coursera course "Fundamentals of Parallelism on Intel Architecture" by Dr. Andrey Vladimirov. The code has been converted to C++ and DPC++.
ArrayFire is a general-purpose tensor library that simplifies software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. This project develops a oneAPI backend for the library, which currently supports CUDA, OpenCL, and x86.
OCCA, an open-source, portable, and vendor-neutral framework for parallel programming on heterogeneous platforms, is used by mission-critical computational science and engineering applications at public and private sector organizations, including the U.S. Department of Energy and Shell.
The objective of the project is to obtain a stream of H.265 videos encoded by external devices or by the server itself, with one big difference: the stream or video will be indexed by convolutional neural network models. The system will then find specific scenes without evaluating all of the media content.
In project OneOligo, we are using oneAPI to implement scalable, heterogeneous parallel-processing algorithms that can quickly and accurately decode digital data stored in synthetic DNA generated by project OligoArchive.
The computing time required to process large data matrices may become impractical, even for a parallel application running on a multiprocessor cluster. NMF-DPC++ is an efficient and easy-to-use implementation of the NMF algorithm that takes advantage of the high computing performance available through SYCL.
This project introduces SYCL/oneAPI into the OP2 DSL for unstructured mesh computations. It evaluates different execution strategies for different target hardware and compilers.