Backpropagation is the method used to train a multilayer neural network to approximate some underlying function by iteratively modifying the network's connection weights. At every iteration, it starts by computing the error function (cost function) of the network with respect to the connection weights, W, as follows:

 

image002,

 

where image004 is the ith target value, image006 the output of the ith neuron of the output layer, and N the number of target values.
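
The formula itself is not reproduced in the text above, so the following is only a sketch of one common error function that fits the stated definitions (target values, output-layer outputs, N values): half the sum of squared differences. The factor of 1/2 and the name sum_squared_error are assumptions, not taken from the original formula.

```python
def sum_squared_error(targets, outputs):
    """Half the sum of squared differences between targets and outputs.

    Assumed form: E = 0.5 * sum_i (t_i - o_i)^2. The original formula may
    use a different normalization (for example, dividing by N).
    """
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must have the same length")
    return 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))


# Two target values compared against two output-layer activations.
error = sum_squared_error([1.0, 0.0], [0.8, 0.3])
```

The error is zero only when every output matches its target, which is the condition the weight updates described later try to approach.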

 

The process then calculates the derivatives of this function at every network layer, starting at the output layer and proceeding in reverse order, as follows:

 

·For the output layer:

The derivative of the error function for the ith neuron of the output layer L with respect to image008, the weight for the output of the jth neuron in the previous layer L-1, is given by:

 

image010

 

where image012 is the output of the jth neuron of the previous layer L-1, image014 is an optional scaling coefficient, image016 is the derivative of the activation function, image018 is the number of neurons in the previous layer, and image020 is the weight for the output of the kth neuron of the previous layer.

 

If we let

image022

then

 

image024

 

·For every other layer except the input layer:

 

The partial derivatives for a hidden layer, image026, may be computed when the image028 values for the following layer are known:

 

image030

image032

 

Where image034.
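
The two cases above can be sketched for a small network with one hidden layer. This is an illustration under stated assumptions, not the exact formulas from the text: it assumes a sigmoid activation, no bias terms, a scaling coefficient of 1, and an error of half the sum of squared differences. All function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Return (hidden, out) activations. Each weight row belongs to one neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    out = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, out


def backward(x, target, w_hidden, w_out):
    """Return (grad_hidden, grad_out): partial derivatives of the assumed
    error 0.5 * sum((t - o)^2) with respect to every weight."""
    hidden, out = forward(x, w_hidden, w_out)

    # Output layer: delta_i = -(t_i - o_i) * f'(net_i).
    # For the sigmoid, f'(net) = o * (1 - o).
    delta_out = [-(t - o) * o * (1.0 - o) for t, o in zip(target, out)]
    # dE/dw_ij = delta_i * (output of neuron j in the previous layer).
    grad_out = [[d * h for h in hidden] for d in delta_out]

    # Hidden layer: delta_j = f'(net_j) * sum_i delta_i * w_ij, where the
    # sum runs over the neurons of the following (output) layer.
    delta_hidden = [
        h * (1.0 - h) * sum(delta_out[i] * w_out[i][j] for i in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    grad_hidden = [[d * xi for xi in x] for d in delta_hidden]
    return grad_hidden, grad_out


# Example: 2 inputs, 2 hidden neurons, 1 output neuron.
grad_hidden, grad_out = backward(
    [0.5, -0.2], [1.0], [[0.1, 0.4], [-0.3, 0.2]], [[0.7, -0.5]]
)
```

The hidden-layer deltas reuse the output-layer deltas, which is why the computation has to run from the output layer backward.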

 

Once the partial derivatives of the error function are known, each connection weight is updated in the direction opposite to its derivative. The aim is to move the weights toward a point where the derivative of the error function is zero, which in theory is where the function reaches a minimum and the network attains an optimal approximation of the underlying function:

 

·If the momentum rate is zero:

 

image036

 

·If the momentum rate is based on the last change:

 

image038

 

·If the momentum rate is based on the last gradient:

 

image040

 

where image042 is the learning rate, image044 is the momentum rate, and image046 and image048 refer to values calculated in the current update and in the previous update, respectively.
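
The update formulas are not reproduced in the text above, so the sketch below shows only plausible forms of the three cases. The zero-momentum and last-change variants follow the usual gradient descent with momentum. The last-gradient variant is an assumed reading in which the previous gradient, scaled by the momentum rate, is added to the current one; the original formula may differ. Function and parameter names are hypothetical.

```python
def update_weight(w, grad, lr, momentum=0.0, prev_change=0.0):
    """Momentum based on the last change. Returns (new_weight, change).

    With momentum=0.0 this reduces to the zero-momentum case:
    change = -lr * grad.
    """
    change = -lr * grad + momentum * prev_change
    return w + change, change


def update_weight_last_gradient(w, grad, lr, momentum=0.0, prev_grad=0.0):
    """Momentum based on the last gradient (ASSUMED form).

    Returns (new_weight, change).
    """
    change = -lr * (grad + momentum * prev_grad)
    return w + change, change


# Usage: minimize E(w) = (w - 3)^2, whose derivative is 2 * (w - 3).
w, change = 1.0, 0.0
for _ in range(200):
    grad = 2.0 * (w - 3.0)
    w, change = update_weight(w, grad, lr=0.1, momentum=0.5, prev_change=change)
# w is now very close to 3.0, where the derivative is zero.
```

Both functions return the change or rely on the previous gradient so that the caller can carry the value from one update into the next.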

 
