Prototyping language models in Julia
- 0 Collaborators
This project encompasses my ongoing research into language models, looking to demonstrate the potential and flexibility of doing such research within the Julia ecosystem.
Project status: Concept
Intel Technologies
Intel CPU, oneAPI
Overview / Usage
This project encompasses my ongoing research into language modeling. This primarily involves work with Transformer-like architectures and their variants, but will also cover other architectures that aim to challenge the Transformer's ubiquity in modern NLP.
Although I have substantial experience with other neural network frameworks (particularly PyTorch), I am intrigued by the possibility of pursuing such work in Julia instead. With first-class multi-processing and distributed computing support, strong multi-vendor GPU support (including the ability to write compute kernels directly in the language itself), and excellent support for automatic differentiation, Julia is, I feel, unfortunately overlooked in numerical computing in favor of more popular Python-centric frameworks.
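As a minimal sketch of the automatic differentiation support mentioned above, the snippet below uses Zygote.jl, a reverse-mode AD package from the Julia ecosystem (the toy loss and parameter values are purely illustrative assumptions, not part of this project's actual models):

```julia
using Zygote  # reverse-mode automatic differentiation for plain Julia code

# A toy squared-error loss over scalar parameters w and b,
# applied to vectors x (inputs) and y (targets).
loss(w, b, x, y) = sum(abs2, w .* x .+ b .- y)

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

# Gradients of the loss with respect to w and b at (w = 1.5, b = 0.0),
# computed by differentiating through ordinary Julia broadcasting.
gw, gb = gradient((w, b) -> loss(w, b, x, y), 1.5, 0.0)
```

No framework-specific tensor type or graph construction is needed here: `gradient` differentiates through regular Julia functions, which is part of what makes the language attractive for this kind of research.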
Methodology / Approach
I aim to make this work as hardware-accessible as possible, ensuring it runs on both CPUs and GPUs alike. Once small-scale prototypes are complete, I would like to eventually scale to multi-device and multi-node contexts.
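One reason this CPU/GPU portability is plausible in Julia is that generic array code compiles to native kernels for whichever array type it receives. The sketch below assumes the oneAPI.jl package (the JuliaGPU backend for Intel GPUs) is installed; the `softmax` helper is an illustrative example, not code from this project:

```julia
using oneAPI  # Intel GPU backend from the JuliaGPU ecosystem (assumed installed)

# A numerically stabilized softmax written as generic, broadcast-based Julia.
# The same function works on CPU Arrays and GPU oneArrays alike.
function softmax(x)
    e = exp.(x .- maximum(x))  # subtract the max for numerical stability
    return e ./ sum(e)
end

x_cpu = randn(Float32, 1024)
y_cpu = softmax(x_cpu)        # executes on the CPU

x_gpu = oneArray(x_cpu)       # move the data to an Intel GPU via oneAPI
y_gpu = softmax(x_gpu)        # identical source code, now GPU-accelerated
```

Swapping `oneArray` for another vendor's array type (e.g. from CUDA.jl or AMDGPU.jl) would leave the model code itself unchanged, which is the property this methodology relies on.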