Continual learning is important: human beings can always learn new things building on existing memory. In real life, we are often faced with a new task that is closely related to previously learned tasks, yet still differs in some respects.
In machine learning, an optimizer for one domain usually requires millions of episodes of training. When a new task arrives, it is not wise to train a new optimizer from scratch if a closely related task has already been trained. Instead, we can transfer the knowledge of the previously trained algorithm to the new task to speed up training of the new optimizer. We call this process “Algorithm Transfer”.
Our architecture achieves this kind of transfer through memory networks, adding one more time scale on top of the learning-to-learn framework. Before training on a new task, we selectively retrieve knowledge learned from previous tasks from the memory through an attention mechanism, and integrate this information with the current model.
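The retrieve-and-integrate step can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the memory layout (one key and one value vector per past task), the dot-product attention, and the blending coefficient `alpha` are all assumptions made for exposition.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_read(query, memory_keys, memory_values):
    """Selectively read knowledge of previous tasks from memory.

    query:         (d,)   embedding of the new task (assumed given)
    memory_keys:   (n, d) one key per previously learned task
    memory_values: (n, k) stored knowledge (e.g. optimizer state) per task
    """
    scores = memory_keys @ query      # similarity of new task to each past task
    weights = softmax(scores)         # attention distribution over memory slots
    return weights @ memory_values    # weighted mixture of past knowledge

def integrate(current_params, retrieved, alpha=0.5):
    # Blend retrieved knowledge into the current model's parameters;
    # a simple convex combination stands in for the integration step.
    return alpha * current_params + (1 - alpha) * retrieved
```

The attention weights let closely related past tasks dominate the retrieved knowledge, while unrelated tasks contribute little, which is what makes the transfer selective rather than a uniform average over memory.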