Gated Graph Recurrent Neural Networks
Luana Ruiz
Philadelphia, Pennsylvania
In this project, we propose a Graph Convolutional Recurrent Neural Network architecture specifically tailored to deal with problems involving graph processes, such as identifying the epicenter of an earthquake or predicting the weather.
Project status: Under Development
HPC, Networking, Artificial Intelligence
Intel Technologies
AI DevCloud / Xeon
Overview / Usage
The availability of ever-growing volumes of data — often referred to as big data — has propelled the use of neural network architectures in both engineering and less traditional fields, such as medicine and business consulting.
But learning from large datasets comes with a challenge: it requires complex models with many parameters which, on the one hand, are time- and memory-intensive and, on the other, increase the risk of overfitting. To get around these issues, much effort has been put into designing architectures that exploit the underlying structure of the data using a manageable number of parameters. The first example is the Convolutional Neural Network (CNN), which uses banks of convolutional filters, whose number of parameters is independent of the size of the input, to extract shared features across grid-like signals (e.g., images). Then there are Graph Convolutional Neural Networks (GNNs), which achieve the same purpose on graph data using graph convolutional filters, also known as linear shift-invariant graph filters (LSI-GFs). A third example is the Recurrent Neural Network (RNN), designed to process sequential data through the addition of a state or memory variable that stores past information.
The sequences processed by RNNs are usually temporal processes, but they are rarely one-dimensional, i.e., they rarely vary only in time. In particular, we are interested in sequences that are best represented as graph processes. Graph processes model a variety of important problems; illustrative examples are predicting the weather from data collected at networks of weather stations and identifying the epicenter of an earthquake from seismic waves.
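To make the LSI-GF idea concrete, here is a minimal NumPy sketch of a graph convolutional filter as a polynomial in a graph shift operator S. The function name and tap parameterization are illustrative, not the implementation used in this project:

```python
import numpy as np

def lsi_graph_filter(S, x, h):
    """Apply an LSI graph filter: y = sum_k h[k] * S^k x.

    S : (N, N) graph shift operator (e.g., adjacency or Laplacian)
    x : (N,)   graph signal, one value per node
    h : list of K+1 filter taps -- the only learnable parameters,
        a count independent of the graph size N
    """
    y = np.zeros_like(x, dtype=float)
    z = x.astype(float)      # z holds S^k x, starting at k = 0
    for hk in h:
        y += hk * z          # accumulate the k-th shifted term
        z = S @ z            # shift the signal once more along the graph
    return y
```

With taps h = [1.0] the filter is the identity, and h = [0.0, 1.0] reduces it to one application of the shift operator, mirroring how a 1-tap temporal filter relates to a delay.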
To deal with these scenarios, we introduce a Graph Convolutional Recurrent Neural Network (GCRNN) architecture in which the hidden state is a graph signal computed from the input and the previous state using banks of graph convolutional filters and, as such, is stored individually at each node. In addition to being local, GCRNNs have a number of learnable parameters that is independent of time, because the graph filters that process the input and the state are the same at every time instant. GCRNNs can take in graph processes of any duration, which gives control over how frequently gradient updates occur. They can also learn many different representations: a signal (whether supported on a graph or not) or a sequence of signals; a class label or a sequence of labels.
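The recurrence described above can be sketched as follows. This is a simplified single-feature version with assumed names and a tanh nonlinearity; the paper's architecture uses full filter banks, but the weight sharing across time steps is the same:

```python
import numpy as np

def graph_filter(S, x, taps):
    # LSI graph filter: y = sum_k taps[k] * S^k x
    y, z = np.zeros_like(x, dtype=float), x.astype(float)
    for hk in taps:
        y += hk * z
        z = S @ z
    return y

def gcrnn_states(S, xs, a, b):
    """Run the recurrence h_t = tanh(A(S) x_t + B(S) h_{t-1}).

    a, b : taps of the input filter A(S) and state filter B(S).
    The SAME taps are reused at every time step, so the parameter
    count does not grow with the length of the input sequence.
    """
    h = np.zeros(S.shape[0])   # initial state: zero graph signal
    states = []
    for x in xs:
        h = np.tanh(graph_filter(S, x, a) + graph_filter(S, h, b))
        states.append(h)
    return states
```

Because the state h is itself a graph signal, each node only needs its own entry of h plus exchanges with its neighbors (the S @ z products) to run the recursion, which is what makes the architecture local.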
Our GCRNN architecture is further extended to include time gating variables analogous to the input and forget gates of Long Short-Term Memory (LSTM) units, which are also implemented using GCRNNs. The objective of gating is twofold: on the input side, to control the importance given to new information; on the state side, to control how much of the stored past information the model should “forget”. GCRNNs’ ability to learn both graph and time dependencies, and the importance of the gating mechanism for long input sequences, are demonstrated in experiments on synthetic and real-world data.
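One way to realize this gating, sketched below under assumed names and an assumed parameterization (the project's exact gate equations may differ): the input and forget gates are themselves computed with graph filters and squashed to (0, 1) by a sigmoid, then multiply the new-information and stored-state terms elementwise:

```python
import numpy as np

def graph_filter(S, x, taps):
    # LSI graph filter: y = sum_k taps[k] * S^k x
    y, z = np.zeros_like(x, dtype=float), x.astype(float)
    for hk in taps:
        y += hk * z
        z = S @ z
    return y

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gated_gcrnn_step(S, x, h, a, b, a_in, b_in, a_f, b_f):
    """One step of a time-gated GCRNN (illustrative parameterization).

    The input gate i scales the new information A(S) x; the forget
    gate f scales the stored state B(S) h. Each gate is computed from
    the current input and state with its own graph filter taps.
    """
    i = sigmoid(graph_filter(S, x, a_in) + graph_filter(S, h, b_in))  # input gate
    f = sigmoid(graph_filter(S, x, a_f) + graph_filter(S, h, b_f))    # forget gate
    return np.tanh(i * graph_filter(S, x, a) + f * graph_filter(S, h, b))
```

Setting all gate taps to zero makes both gates constant at 0.5, recovering an evenly weighted blend of new and stored information; training moves the gates away from this neutral point.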
Methodology / Approach
We are using Python 3.6 and the PyTorch library to develop an architecture akin to that of traditional RNNs, but where the linear transformations are implemented using graph convolutional filters (https://arxiv.org/abs/1805.00165) to leverage the underlying topology of graph processes. Gating strategies that take the graph structure into account, such as node and edge gating, are also being devised.
PyTorch is our library of choice for its easy integration with NumPy and because this project utilizes many of the tools available in our research lab's public GNN library (https://github.com/alelab-upenn/graph-neural-networks).
This project is funded by Intel through the ISTC Wireless Autonomous Systems project and has already produced a conference publication at the 2019 European Signal Processing Conference, to be held in A Coruña, Spain, in September of this year. The conference paper can be found at https://arxiv.org/abs/1903.01888. A full journal version is under preparation, which is why we are requesting an access extension to the Intel AI DevCloud.