OMPify

Tal Kadosh

OMPify is a tool for creating a comprehensive database that contains multiple modalities of code, such as source code and its AST representation. The database can then be used to train large language models to learn the semantics of code and generate OpenMP pragmas automatically.

Project status: Under Development

oneAPI, HPC, Artificial Intelligence

Intel Technologies
oneAPI

Overview / Usage

To exploit the full potential of multi-core architectures, which are fundamental components of modern computing, shared-memory parallelization schemes are always needed. Today, the most common parallelization API for this task is OpenMP. Although the API is comprehensive, parallelizing code with it remains a challenging task.

To address this challenge, many source-to-source (S2S) compilers have emerged in recent years that attempt to automate the process. However, these compilers have limitations: they can be time-consuming to run, and their robustness to varied inputs is limited.

We developed a tool that creates a database of code gathered from GitHub, which includes the source code, AST format, and OpenMP pragmas related to any for-loops present in the code. This database, called Open-OMP, provides a comprehensive collection of code files that use OpenMP pragmas, making it a valuable resource for anyone interested in parallelizing their code using OpenMP.
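To make the shape of such a database entry concrete, here is a minimal sketch of pairing each for-loop with the OpenMP pragma written directly above it. It is an illustration under stated assumptions, not the OMPify implementation: the function name `extract_loop_records`, the record fields `loop` and `pragma`, and the line-based regular-expression matching are all hypothetical. A real pipeline would more likely work on a parsed AST, and this sketch ignores pragmas continued across lines with a backslash.

```python
import re

# Matches an OpenMP directive line such as "#pragma omp parallel for".
PRAGMA_RE = re.compile(r"^\s*#\s*pragma\s+omp\b")
# Matches the start of a C for-loop header.
FOR_RE = re.compile(r"^\s*for\s*\(")


def extract_loop_records(source: str) -> list:
    """Pair each for-loop header with the OpenMP pragma directly above it, if any."""
    records = []
    pending_pragma = None
    for line in source.splitlines():
        if PRAGMA_RE.match(line):
            # Remember the pragma (whitespace-normalized) for the next loop.
            pending_pragma = " ".join(line.split())
            continue
        if FOR_RE.match(line):
            records.append({"loop": line.strip(), "pragma": pending_pragma})
        if line.strip():
            # Any non-blank line ends the pragma's reach.
            pending_pragma = None
    return records


# Hypothetical input: one loop annotated with OpenMP, one left serial.
SAMPLE_C = """
double sum = 0.0;
#pragma omp parallel for reduction(+:sum)
for (int i = 0; i < n; i++) {
    sum += a[i] * b[i];
}
for (int i = 1; i < n; i++) {
    a[i] = a[i - 1] + 1.0;
}
"""

records = extract_loop_records(SAMPLE_C)
for record in records:
    print(record)
```

The first loop is recorded with its pragma, while the second, which carries a loop dependence on `a[i - 1]`, is recorded with `pragma` set to `None`. Keeping unannotated loops alongside annotated ones is a design choice of this sketch; the source does not say whether the database stores them.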

To automate the process of parallelization even further, we plan to leverage large language models that can effectively learn the semantics of code and generate OpenMP pragmas automatically. By using multi-modal input, our language models can take into account not only the code itself but also other relevant information, such as the structure of the AST and the context in which the code is used.

OMPify is a tool designed to leverage the latest advances in natural language processing models to automatically generate OpenMP pragmas.

Methodology / Approach

Our methodology for developing OMPify involves creating a high-quality database that contains as little noise as possible, and examining different LLMs that exploit different aspects of code, such as its structure and variable dependencies. We also employ techniques such as transfer learning and fine-tuning to improve the accuracy and efficiency of our models.
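The text does not say how noise is removed from the database. As one assumed example of a cleaning step, the sketch below drops near-duplicate snippets by hashing a normalized form of each one. The helper names `normalize` and `deduplicate` are hypothetical, and the regex-based comment stripping is a simplification that would misfire on string literals containing `//` or `/*`.

```python
import hashlib
import re


def normalize(code: str) -> str:
    """Strip C comments and collapse whitespace so trivial variants compare equal."""
    code = re.sub(r"/\*.*?\*/", " ", code, flags=re.DOTALL)  # block comments
    code = re.sub(r"//[^\n]*", " ", code)  # line comments
    return " ".join(code.split())


def deduplicate(snippets):
    """Keep the first snippet seen for each distinct normalized form."""
    seen = set()
    unique = []
    for snippet in snippets:
        digest = hashlib.sha256(normalize(snippet).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(snippet)
    return unique


# Hypothetical input: the second snippet differs from the first only in
# spacing and a trailing comment, so it is treated as a duplicate.
snippets = [
    "for (int i = 0; i < n; i++) a[i] = 0;",
    "for (int i = 0; i < n; i++)   a[i] = 0; // reset",
    "for (int j = 0; j < m; j++) b[j] = 1;",
]

unique = deduplicate(snippets)
print(len(unique))
```

Exact-match hashing after normalization only catches trivial variants; catching loops that differ in variable names would need a structural comparison, which is outside this sketch.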

Repository

https://github.com/Scientific-Computing-Lab-NRCN/OMPify
