Paper Summary: “Artificial Evolution of Plastic Neural Networks: a few Key Concepts”

A beginner’s guide to Plastic Neural Networks

Dickson Wu
9 min read · Aug 31, 2021

Abstract:

This paper is a survey of the works that mix NeuroEvolution and Synaptic Plasticity. It takes all the concepts and organizes them into a hierarchy so they can be compared and classified.

Introduction:

One way that nature differs from ML and robots is its ability to adapt to new scenarios and environments. This ability breaks down into 3 main components, each with an ML counterpart:

  1. How biology learns → creating new learning mechanisms for ML
  2. How biology evolves → automatically creating NNs
  3. How synaptic plasticity works → create agents that adapt during their lifetime to their environments

All three components have inspired ML counterparts, but so far they have mostly been applied independently of each other.

The goal of this paper is to give precise definitions for all of these ideas. Some of the definitions and distinctions are new, but the main gem of this paper is presenting everything in one coherent framework, which is below:

[Figure: the paper's tree of definitions]

Synaptic Plasticity:

Plasticity is the ability of the brain to change structurally and functionally when it interacts with the environment. Typically we see this during phases of development and learning. And there are 2 subcategories of plasticity:

  1. Structural Plasticity: When we create new connections, thus changing the topology (aka structure) of the network
  2. Synaptic Plasticity (functional plasticity): When we change the weights between connections

An example of structural plasticity: one paper encoded instructions for growing a neural network. The genome described where the neurons started and rules for how they grew (e.g. a neuron would grow new connections if it was stimulated enough by surrounding neurons or by environmental triggers). The same set of rules could thus give rise to different phenotypes (behaviours) depending on the environment!

A lot of studies focus on synaptic plasticity, and when they aren't using backpropagation, they use Hebb's rule: "Cells that fire together wire together". There are many variants of Hebb's rule that follow that basic theme:

Hebb’s Rule:

w_ij(t+1) = w_ij(t) + ∆w_ij, where ∆w_ij = η · a_i · a_j
  • i and j are 2 neurons. i is in one layer, and j is in the layer after it
  • w_ij is the weight between i and j
  • therefore w_ij(t) is the current weight between i and j, and w_ij(t+1) is the updated weight
  • ∆w_ij is the shift in the weight (we add this to our old weight in order to get our new one)
  • ∆w_ij is equal to η (the learning rate) x a_i (activation of the i neuron) x a_j (activation of the j neuron)
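Here's a minimal sketch of this update in Python (the function and variable names are mine, not the paper's):

```python
def hebb_update(w_ij, a_i, a_j, eta=0.01):
    """Plain Hebb's rule: w_ij(t+1) = w_ij(t) + eta * a_i * a_j."""
    delta_w = eta * a_i * a_j  # cells that fire together wire together
    return w_ij + delta_w

# Two co-active neurons strengthen their connection over 3 steps:
w = 0.5
for _ in range(3):
    w = hebb_update(w, a_i=1.0, a_j=0.8)
print(w)  # 0.5 + 3 * (0.01 * 1.0 * 0.8) ≈ 0.524
```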

Extended Hebbian Rule:

The top part is the same: you take the old weight w_ij(t) and add ∆w_ij. But we can use different equations for ∆w_ij:

w_ij(t+1) = w_ij(t) + ∆w_ij, with ∆w_ij = f(a_i, a_j, …)

There are different functions f() we can plug in, but the simplest version is a linear combination of the activations of the i and j neurons:

∆w_ij = η · (A · a_i · a_j + B · a_i + C · a_j + D)

A, B, C, D are just real-valued parameters. You can either hard-code these values or let evolution tune them, and they can be shared across all neurons or set individually per connection.
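As a sketch, the linear-combination rule is a one-liner. The defaults below are mine; in practice A, B, C, D would typically be evolved:

```python
def extended_hebb_delta(a_i, a_j, eta=0.01, A=1.0, B=0.0, C=0.0, D=0.0):
    """Linear-combination Hebbian rule:
    delta_w_ij = eta * (A*a_i*a_j + B*a_i + C*a_j + D).
    Setting A=1 and B=C=D=0 recovers plain Hebb's rule."""
    return eta * (A * a_i * a_j + B * a_i + C * a_j + D)
```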

Modulated Hebbian Rule:

[Figure: modulatory neurons (k) and modulated input neurons (i) both feeding into neuron j]

Of course, nature is more complex than this. There are things called modulatory neurons that affect the strengthening and weakening of connections: they don't provide input signals, they only affect how the other connections change.

In the diagram above, you can see there are 2 types of inputs to neuron j: the modulatory neurons (k) and the modulated neurons (i). Alright, now it's time for the fancy formula!

m_j = tanh( Σ_{k ∈ I_j^(m)} w_kj · a_k )

∀i: ∆w_ij = m_j · (A · a_i · a_j + B · a_i + C · a_j + D)

  • In the big picture, the modulatory signal m_j just replaces the learning rate (it scales how much the weights change)
  • I_j^(m) = the set of all modulatory neurons feeding into j (all the k neurons)
  • We iterate through all the modulatory neurons, grab their weights and multiply by the activations of the k neurons
  • We sum it all up, shove it through the tanh function, and we have m_j (the learning-rate-like term)
  • The line with the ∀ means: we iterate through all the input neurons (all the i neurons) and change their weights as usual, except scaled by the new m_j factor
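Here's a vectorized sketch of one modulated update for a single neuron j, assuming the linear-combination f from above as the "usual" weight change (names and defaults are mine):

```python
import numpy as np

def modulated_hebb_step(w_in, a_in, w_mod, a_mod, a_j,
                        A=1.0, B=0.0, C=0.0, D=0.0):
    """One modulated Hebbian step for neuron j.
    w_in, a_in: weights/activations of the modulated inputs (the i neurons).
    w_mod, a_mod: weights/activations of the modulatory inputs (the k neurons)."""
    m_j = np.tanh(w_mod @ a_mod)  # the modulatory signal m_j
    delta = m_j * (A * a_in * a_j + B * a_in + C * a_j + D)
    return w_in + delta           # update all the w_ij at once

# With no modulatory activity, m_j = 0 and plasticity is switched off:
w = modulated_hebb_step(np.array([0.5, 0.2]), np.array([1.0, 1.0]),
                        np.array([0.7]), np.array([0.0]), a_j=1.0)
print(w)  # unchanged: [0.5 0.2]
```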

What's the advantage of this? Beyond mimicking nature more closely, we can treat the modulatory neurons as reward signals. We could then arrange things so that plasticity is only turned on whenever a reward comes in, or allow only certain parts of the network to be plastic at certain times (during the lifetime, at the end of epochs, etc.).

And modulated Hebbian plasticity has shown experimental benefits over networks without modulation.

Robustness and Reward-Based Scenarios:

In the real world, nothing is constant: environments change all the time, and organisms degrade through wear and tear. Animals adapt to these changes incredibly well, so why don't we do the same with machines? That's behavioural robustness:

Behavioural Robustness:

Behavioural robustness is when an agent keeps the same behaviour while withstanding change (in the environment or in itself), usually without any reward signal. One paper tested exactly this across 4 different environments: the plastic network was able to adapt to all 4, while the fixed-weight network was not.

But one problem with that study was that it didn't reflect nature: the weights changed continuously as the agent switched tasks, whereas in nature synaptic weights tend to hold the same values over long periods of time.

Behavioural Changes:

But what if we want behavioural robustness plus rewards and punishments? That's behavioural change.

One example is a maze shaped like a T. Place a reward at one of the ends and let the network run. At first its behaviour is random, but once it finds the reward a few times it goes directly to it. If we then move the reward, it learns to go to the new location.

Thus behavioural change = when we move the reward, the agent changes its behaviour, and the weights stop changing much once it reaches the optimal behaviour.
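Here's a toy sketch of that T-maze protocol. A simple value-tracking agent stands in for an actual plastic network, and all the numbers are illustrative:

```python
import random

def t_maze(n_trials=100, switch_at=50, eta=0.3, epsilon=0.1):
    """T-maze with a mid-run reward switch: the agent must re-adapt."""
    random.seed(0)
    values = [0.0, 0.0]                # value estimates for the left/right arm
    reward_arm = 0
    for t in range(n_trials):
        if t == switch_at:
            reward_arm = 1             # move the reward to the other arm
        if random.random() < epsilon:
            arm = random.randrange(2)  # occasional random exploration
        else:
            arm = 0 if values[0] >= values[1] else 1
        r = 1.0 if arm == reward_arm else 0.0
        values[arm] += eta * (r - values[arm])  # reward-driven update
    return values

print(t_maze())  # the right arm typically ends up with the higher value
```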

Learning Abilities in Discrete Environments:

What if we want to step it up a notch and get the agent to adapt to scenarios it has never seen before? To help explain the definitions that follow (a lot of symbols are coming), we have to talk about the Skinner box:

There's an agent in a cage. The cage has n stimuli (lights), m possible actions, and rewards (food) / punishments (electric shocks). The goal of the agent is to learn the correct associations between stimuli and actions. And now for some fancy symbols:

  • Agent = a neural network = N(I, λ). It takes in inputs (I) and has weights (λ)
  • The agent tries to map the inputs I (which live in [0,1]^n) to the optimal outputs K (which live in [0,1]^m)
  • The weights are updated by a learning function that takes a random initial set of weights (λ_r), the inputs (I), and the reward function (R_I,K)
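A minimal sketch of that setup, with a random placeholder where the real agent N(I, λ) would go (sizes and names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_actions = 4, 2

# One association set: the action each stimulus should trigger.
association_set = {s: int(rng.integers(n_actions)) for s in range(n_stimuli)}

def reward(stimulus, action):
    """The reward function R_(I,K): food (+1) for the right action,
    an electric shock (-1) otherwise."""
    return 1.0 if action == association_set[stimulus] else -1.0

# One step of the Skinner-box loop:
s = int(rng.integers(n_stimuli))
a = int(rng.integers(n_actions))  # a real agent would compute N(I, lambda)
print(s, a, reward(s, a))
```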

(Note: This whole section is only for discrete worlds) Now it’s time for a ton of definitions:

Association Set:

An association set is literally a set of (input, output) pairs. And we can collect all possible association sets into one big set: 𝔸.

A = {(I₁, K₁), (I₂, K₂), …}, with A ∈ 𝔸

Fitness Associations Set:

When we're evaluating the fitness of networks during evolution, we test them on particular association sets. Those are the fitness association sets.

Learnable Set:

Some network structures don't allow the network to learn every association set (a simple network where all inputs map directly to the outputs, with no hidden layers, won't get very far). So each neural network has a learnable set, which contains all the association sets it can learn.

We define this formally by:

1. We iterate through all association sets and look at their (input, output) pairs.

2. Then we ask: "Is there some random set of initial weights λ_r from which the network can eventually learn this (input, output) mapping?"

3. If there is, the association set is part of the learnable set! If not, it isn't.

The learnable set of a particular neural network N contains exactly the association sets that pass this test.
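In code, the "does there exist a λ_r" test can only be approximated, e.g. by sampling many random initializations. Here's a self-contained toy version with a one-layer network and a reward-modulated update (the sizes and the learning rule are mine, not the paper's):

```python
import numpy as np

def learns(association_set, seed, n_stimuli=3, n_actions=2,
           n_epochs=100, eta=0.5):
    """Try to learn one association set {stimulus: action} starting from
    one random initialization lambda_r, using a reward-modulated update."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.1, (n_actions, n_stimuli))  # lambda_r
    for _ in range(n_epochs):
        for s, k in association_set.items():
            a = int(np.argmax(W[:, s]))  # pick the best-looking action
            r = 1.0 if a == k else -1.0  # reward or punishment
            W[a, s] += eta * r           # strengthen or weaken that choice
    return all(int(np.argmax(W[:, s])) == k for s, k in association_set.items())

def in_learnable_set(association_set, n_inits=20):
    """Approximate 'there exists a lambda_r' by sampling several seeds."""
    return any(learns(association_set, seed) for seed in range(n_inits))

print(in_learnable_set({0: 1, 1: 0, 2: 1}))  # True for this toy network
```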

Synaptic General Learning Abilities (sGLA):

A neural network has sGLA when it is able to learn all association sets, i.e. its learnable set is the whole of 𝔸.

A multi-layer perceptron trained with backprop is one example of a network with sGLA!

Evolution of Behavioural Switches:

If we want to evolve behavioural switches, we just check whether the evolved network can learn every association set it was evaluated on (its fitness association sets).

Evolution of sGLA for Unknown Situations:

There are 2 requirements: the association sets we train on (the fitness association sets) don't cover everything, and yet the evolved network (which can take any form) must still end up able to learn every association set.

In nature, we have a large, diverse, somewhat randomized population, which collectively covers a wide range of scenarios. For neural networks, we can likewise use a large population of randomized association sets to push evolution toward sGLA.
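Here's a tiny evolutionary loop in that spirit: the genome is just a plasticity rate η, and fitness is measured on freshly randomized association sets every generation, so high fitness requires general learning rather than one memorized task (all sizes and rates are illustrative):

```python
import numpy as np

def learning_score(eta, rng, n_stimuli=3, n_actions=2, n_epochs=60):
    """Fraction of a *random* association set learned with plasticity rate eta."""
    assoc = {s: int(rng.integers(n_actions)) for s in range(n_stimuli)}
    W = rng.normal(0.0, 0.1, (n_actions, n_stimuli))
    for _ in range(n_epochs):
        for s, k in assoc.items():
            a = int(np.argmax(W[:, s]))
            W[a, s] += eta * (1.0 if a == k else -1.0)
    return float(np.mean([int(np.argmax(W[:, s])) == k for s, k in assoc.items()]))

def evolve_eta(pop_size=20, n_gens=25, seed=0):
    """Evolve eta with truncation selection plus Gaussian mutation."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(0.0, 1.0, pop_size)  # genomes: candidate eta values
    for _ in range(n_gens):
        fit = [np.mean([learning_score(eta, rng) for _ in range(3)]) for eta in pop]
        parents = pop[np.argsort(fit)[-5:]]  # keep the 5 fittest
        children = rng.choice(parents, pop_size) + rng.normal(0.0, 0.05, pop_size)
        pop = np.clip(children, 0.0, 1.0)
    return parents

print(evolve_eta())  # surviving eta values
```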

Synaptic Translative Learning Abilities (sTLA):

The encoding of the network structure also matters, since a good encoding can decrease the number of evaluations needed to reach GLA (we train on only a few association sets and the network generalizes to all of them).

Now with the fancy symbols:

1. An sTLA set is a subset of all association sets, and we say that a neural network has sTLA if there is at least one such subset...

2. ...such that, after training on it, the network's learnable set is equal to the set of all association sets 𝔸.

And another definition is the sTLA-Level, which is the cardinality (the number of association sets it contains) of an sTLA set.

Optimal-sTLA:

A neural network has Optimal-sTLA when it has an sTLA set with an sTLA-Level of 1: training on a single association set is enough for it to become able to learn them all.

Concluding Remarks:

There are still many questions to be answered for future work:

  1. Should we focus more on structural plasticity?
  2. How do we evaluate learning abilities in continuous worlds?
  3. What are the links between encoding, plasticity and learnability?

If you want to find out more: Read the paper here!

Thanks for reading! I’m Dickson, an 18-year-old ML Engineer @Rapyuta Robotics who’s excited to impact billions 🌎

If you want to follow along on my journey, you can join my monthly newsletter, check out my website, and connect on LinkedIn or Twitter 😃
