Chapter 11 The Neural Network Approach:
This chapter discusses the feed-forward network (FFN) approach.
Much to do!
11.1 The Neural Network Toolbox in MATLAB:
%
% As usual, set our paths
%
>> path(path,'/local2/petersj/eigenfaces/Images');
%
% Let's look at the help topics for the
% Neural Network toolbox:
%
>> help nnet
Neural Network Toolbox.
Version 3.0.1 (R11) 01-Jul-1998
Adapt functions.
adaptwb - By-weight-and-bias network adaption function.
Analysis functions.
errsurf - Error surface of single input neuron.
maxlinlr - Maximum learning rate for a linear layer.
Distance functions.
boxdist - Box distance function.
dist - Euclidean distance weight function.
mandist - Manhattan distance weight function.
linkdist - Link distance function.
Layer initialization functions.
initnw - Nguyen-Widrow layer initialization function.
initwb - By-weight-and-bias layer initialization function.
Learning functions.
learncon - Conscience bias learning function.
learngd - Gradient descent weight/bias learning function.
learngdm - Gradient descent w/momentum weight/bias learning function.
learnh - Hebb weight learning function.
learnhd - Hebb with decay weight learning function.
learnis - Instar weight learning function.
learnk - Kohonen weight learning function.
learnlv1 - LVQ1 weight learning function.
learnlv2 - LVQ2 weight learning function.
learnos - Outstar weight learning function.
learnp - Perceptron weight/bias learning function.
learnpn - Normalized perceptron weight/bias learning function.
learnsom - Self-organizing map weight learning function.
learnwh - Widrow-Hoff weight/bias learning rule.
Line search functions.
srchbac - Backtracking search.
srchbre - Brent's combination golden section/quadratic interpolation.
srchcha - Charalambous' cubic interpolation.
srchgol - Golden section search.
srchhyb - Hybrid bisection/cubic search.
New networks.
network - Create a custom neural network.
newc - Create a competitive layer.
newcf - Create a cascade-forward backpropagation network.
newelm - Create an Elman backpropagation network.
newff - Create a feed-forward backpropagation network.
newfftd - Create a feed-forward input-delay backprop network.
newgrnn - Design a generalized regression neural network.
newhop - Create a Hopfield recurrent network.
newlin - Create a linear layer.
newlind - Design a linear layer.
newlvq - Create a learning vector quantization network.
newp - Create a perceptron.
newpnn - Design a probabilistic neural network.
newrb - Design a radial basis network.
newrbe - Design an exact radial basis network.
newsom - Create a self-organizing map.
Net input functions.
netprod - Product net input function.
netsum - Sum net input function.
Net input derivative functions.
dnetprod - Product net input derivative function.
dnetsum - Sum net input derivative function.
Network initialization functions.
initlay - Layer-by-layer network initialization function.
Performance functions.
mae - Mean absolute error performance function.
mse - Mean squared error performance function.
msereg - Mean squared error with regularization performance function.
sse - Sum squared error performance function.
Performance derivative functions.
dmae - Mean absolute error performance derivatives function.
dmse - Mean squared error performance derivatives function.
dmsereg - Mean squared error w/reg performance derivative function.
dsse - Sum squared error performance derivative function.
Plotting functions.
hintonw - Hinton graph of weight matrix.
hintonwb - Hinton graph of weight matrix and bias vector.
plotbr - Plot network performance for Bayesian regularization training.
plotes - Plot an error surface of a single input neuron.
plotpc - Plot classification line on perceptron vector plot.
plotpv - Plot perceptron input/target vectors.
plotep - Plot a weight-bias position on an error surface.
plotperf - Plot network performance.
plotsom - Plot self-organizing map.
plotv - Plot vectors as lines from the origin.
plotvec - Plot vectors with different colors.
Pre and Post Processing.
prestd - Normalize data for unity standard deviation and zero mean.
poststd - Unnormalize data which has been normalized by PRESTD.
trastd - Transform data with precalculated mean and standard deviation.
premnmx - Normalize data for maximum of 1 and minimum of -1.
postmnmx - Unnormalize data which has been normalized by PREMNMX.
tramnmx - Transform data with precalculated minimum and maximum.
prepca - Principal component analysis on input data.
trapca - Transform data with PCA matrix computed by PREPCA.
postreg - Post-training regression analysis.
Simulink support.
gensim - Generate a SIMULINK block to simulate a neural network.
Topology functions.
gridtop - Grid layer topology function.
hextop - Hexagonal layer topology function.
randtop - Random layer topology function.
Training functions.
trainbfg - BFGS quasi-Newton backpropagation.
trainbr - Bayesian regularization.
traincgb - Powell-Beale conjugate gradient backpropagation.
traincgf - Fletcher-Powell conjugate gradient backpropagation.
traincgp - Polak-Ribiere conjugate gradient backpropagation.
traingd - Gradient descent backpropagation.
traingdm - Gradient descent with momentum backpropagation.
traingda - Gradient descent with adaptive lr backpropagation.
traingdx - Gradient descent w/momentum & adaptive lr backpropagation.
trainlm - Levenberg-Marquardt backpropagation.
trainoss - One step secant backpropagation.
trainrp - Resilient backpropagation (Rprop).
trainscg - Scaled conjugate gradient backpropagation.
trainwb - By-weight-and-bias network training function.
trainwb1 - By-weight-&-bias 1-vector-at-a-time training function.
Transfer functions.
compet - Competitive transfer function.
hardlim - Hard limit transfer function.
hardlims - Symmetric hard limit transfer function.
logsig - Log sigmoid transfer function.
poslin - Positive linear transfer function.
purelin - Linear transfer function.
radbas - Radial basis transfer function.
satlin - Saturating linear transfer function.
satlins - Symmetric saturating linear transfer function.
softmax - Soft max transfer function.
tansig - Hyperbolic tangent sigmoid transfer function.
tribas - Triangular basis transfer function.
Transfer derivative functions.
dhardlim - Hard limit transfer derivative function.
dhardlms - Symmetric hard limit transfer derivative function.
dlogsig - Log sigmoid transfer derivative function.
dposlin - Positive linear transfer derivative function.
dpurelin - Linear transfer derivative function.
dradbas - Radial basis transfer derivative function.
dsatlin - Saturating linear transfer derivative function.
dsatlins - Symmetric saturating linear transfer derivative function.
dtansig - Hyperbolic tangent sigmoid transfer derivative function.
dtribas - Triangular basis transfer derivative function.
Update networks from previous versions.
nnt2c - Update NNT 2.0 competitive layer to NNT 3.0.
nnt2elm - Update NNT 2.0 Elman backpropagation network to NNT 3.0.
nnt2ff - Update NNT 2.0 feed-forward network to NNT 3.0.
nnt2hop - Update NNT 2.0 Hopfield recurrent network to NNT 3.0.
nnt2lin - Update NNT 2.0 linear layer to NNT 3.0.
nnt2lvq - Update NNT 2.0 learning vector quantization network to NNT 3.0.
nnt2p - Update NNT 2.0 perceptron to NNT 3.0.
nnt2rb - Update NNT 2.0 radial basis network to NNT 3.0.
nnt2som - Update NNT 2.0 self-organizing map to NNT 3.0.
Using networks.
sim - Simulate a neural network.
init - Initialize a neural network.
adapt - Allow a neural network to adapt.
train - Train a neural network.
disp - Display a neural network's properties.
display - Display the name and properties of a neural network variable.
Vectors.
cell2mat - Combine cell array of matrices into one matrix.
concur - Create concurrent bias vectors.
con2seq - Convert concurrent vectors to sequential vectors.
combvec - Create all combinations of vectors.
ind2vec - Convert indices to vectors.
mat2cell - Break matrix up into cell array of matrices.
minmax - Ranges of matrix rows.
nncopy - Copy matrix or cell array.
normc - Normalize columns of a matrix.
normr - Normalize rows of a matrix.
pnormc - Pseudo-normalize columns of a matrix.
quant - Discretize values as multiples of a quantity.
seq2con - Convert sequential vectors to concurrent vectors.
sumsqr - Sum squared elements of matrix.
vec2ind - Convert vectors to indices.
Weight functions.
dist - Euclidean distance weight function.
dotprod - Dot product weight function.
mandist - Manhattan distance weight function.
negdist - Negative distance weight function.
normprod - Normalized dot product weight function.
Weight and bias initialization functions.
initcon - Conscience bias initialization function.
initzero - Zero weight/bias initialization function.
midpoint - Midpoint weight initialization function.
randnc - Normalized column weight initialization function.
randnr - Normalized row weight initialization function.
rands - Symmetric random weight/bias initialization function.
Weight derivative functions.
ddotprod - Dot product weight derivative function.
For functions listed by network type:
assoclr - Associative learning rules.
backprop - Backpropagation networks.
elman - Elman recurrent networks.
hopfield - Hopfield recurrent networks.
linnet - Linear networks.
lvq - Learning vector quantization.
percept - Perceptrons.
radbasis - Radial basis networks.
selforg - Self-organizing networks.
%
% Let's get help on how to set up
% a Feed Forward Network
%
>> help newff
NEWFF Create a feed-forward backpropagation network.
Syntax
net = newff(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF)
Description
NEWFF(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF) takes,
PR - Rx2 matrix of min and max values for R input elements.
Si - Size of ith layer, for Nl layers.
TFi - Transfer function of ith layer, default = 'tansig'.
BTF - Backprop network training function, default = 'trainlm'.
BLF - Backprop weight/bias learning function, default = 'learngdm'.
PF - Performance function, default = 'mse'.
and returns an N layer feed-forward backprop network.
The transfer functions TFi can be any differentiable transfer
function such as TANSIG, LOGSIG, or PURELIN.
The training function BTF can be any of the backprop training
functions such as TRAINLM, TRAINBFG, TRAINRP, TRAINGD, etc.
*WARNING*: TRAINLM is the default training function because it
is very fast, but it requires a lot of memory to run. If you get
an "out-of-memory" error when training try doing one of these:
(1) Slow TRAINLM training, but reduce memory requirements, by
setting NET.trainParam.mem_reduc to 2 or more. (See HELP TRAINLM.)
(2) Use TRAINBFG, which is slower but more memory efficient than TRAINLM.
(3) Use TRAINRP which is slower but more memory efficient than TRAINBFG.
The learning function BLF can be either of the backpropagation
learning functions such as LEARNGD, or LEARNGDM.
The performance function can be any of the differentiable performance
functions such as MSE or MSEREG.
Examples
Here is a problem consisting of inputs P and targets T that we would
like to solve with a network.
P = [0 1 2 3 4 5 6 7 8 9 10];
T = [0 1 2 3 4 3 2 1 2 3 4];
Here a two-layer feed-forward network is created. The network's
input ranges from [0 to 10]. The first layer has five TANSIG
neurons, the second layer has one PURELIN neuron. The TRAINLM
network training function is to be used.
net = newff([0 10],[5 1],{'tansig' 'purelin'});
Here the network is simulated and its output plotted against
the targets.
Y = sim(net,P);
plot(P,T,P,Y,'o')
Here the network is trained for 50 epochs. Again the network's
output is plotted.
net.trainParam.epochs = 50;
net = train(net,P,T);
Y = sim(net,P);
plot(P,T,P,Y,'o')
Algorithm
Feed-forward networks consist of Nl layers using the DOTPROD
weight function, NETSUM net input function, and the specified
transfer functions.
The first layer has weights coming from the input. Each subsequent
layer has a weight coming from the previous layer. All layers
have biases. The last layer is the network output.
Each layer's weights and biases are initialized with INITNW.
Adaption is done with ADAPTWB which updates weights with the
specified learning function. Training is done with the specified
training function. Performance is measured according to the specified
performance function.
See also NEWCF, NEWELM, SIM, INIT, ADAPT, TRAIN
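The Algorithm paragraph of the help text can be made concrete with a short sketch of the forward pass for the network in the help example (one input, five TANSIG neurons, one PURELIN neuron). This is only an illustration in base MATLAB, not the toolbox's own code: the weights W1, b1, W2, b2 are random stand-ins (the toolbox would initialize them with INITNW), and tanh is used because it is mathematically the same function as TANSIG.

```matlab
% Sizes taken from the help example: 1 input, 5 hidden neurons, 1 output.
R = 1; S1 = 5; S2 = 1;

% Random stand-in weights and biases in [-1,1] (assumption: the toolbox
% would use INITNW here instead).
W1 = 2*rand(S1,R) - 1;   b1 = 2*rand(S1,1) - 1;
W2 = 2*rand(S2,S1) - 1;  b2 = 2*rand(S2,1) - 1;

P = 0:10;                % 11 input samples, one per column
Q = size(P,2);

% Layer 1: dot-product weights, summed with the bias, then tanh (TANSIG).
A1 = tanh(W1*P + b1*ones(1,Q));     % S1 x Q

% Layer 2: dot-product weights, summed with the bias, then linear (PURELIN).
Y = W2*A1 + b2*ones(1,Q);           % S2 x Q
```

Each column of Y is the untrained network's output for the matching column of P, which is the role sim(net,P) plays in the help example.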
%
% Let's set up a Neural Network to classify patterns
%
% Our training data is in the file 90trna.cop
% and consists of 99 I/O pairs. Each input is
% a vector of size 32 and each output is one of 11
% binary classifications, so it is a vector of
% size 11.
%
% We will use a 32 x 7 x 11 FFN
%
%
% This is what we need to set up
%
%net = newff(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF)
% Description
%
% NEWFF(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF) takes,
% PR - Rx2 matrix of min and max values for R input elements.
% Si - Size of ith layer, for Nl layers.
% TFi - Transfer function of ith layer, default = 'tansig'.
% BTF - Backprop network training function, default = 'trainlm'.
% BLF - Backprop weight/bias learning function, default = 'learngdm'.
% PF - Performance function, default = 'mse'.
% and returns an N layer feed-forward backprop network.
% We need to read in the training data
% Let's do this: use the function ReadIO:
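function [A,B] = ReadIO(IOsize,Outsize,DataSize,name)
%
% Read in pattern I/O data from the text file name.
% Each of the DataSize records holds
%   IOsize floats:  the pattern, here IOsize x 1
%   Outsize floats: the class of the pattern, here Outsize x 1
% A is DataSize x IOsize and B is DataSize x Outsize.
%
A = zeros(DataSize,IOsize);
B = zeros(DataSize,Outsize);
fid = fopen(name);
if fid == -1
  % fopen returns -1 when the file cannot be opened
  error(['ReadIO: could not open ' name]);
end
for count = 1:DataSize
  % read one input pattern and store it as row count of A
  U = fscanf(fid,'%f',IOsize);
  A(count,1:IOsize) = transpose(U);
  % read the matching output pattern and store it as row count of B
  V = fscanf(fid,'%f',Outsize);
  B(count,1:Outsize) = transpose(V);
end
fclose(fid);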
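As a cross-check on the record layout ReadIO assumes (IOsize input values followed directly by Outsize output values, whitespace separated), here is a minimal sketch that reads the same layout without a loop. The temporary file and its ten numbers are hypothetical and exist only to make the example self-contained.

```matlab
% Write a tiny hypothetical data file: 2 records, each 3 inputs + 2 outputs.
name = [tempname '.txt'];
fid = fopen(name,'w');
fprintf(fid,'%f ',[1 2 3 1 0 4 5 6 0 1]);
fclose(fid);

IOsize = 3; Outsize = 2; DataSize = 2;

% Read every number at once. fscanf fills the matrix column by column,
% so each column of D holds one record (inputs followed by outputs).
fid = fopen(name,'r');
if fid == -1
    error(['Could not open ' name]);
end
D = fscanf(fid,'%f',[IOsize+Outsize, DataSize]);
fclose(fid);
delete(name);

A = D(1:IOsize,:)';        % DataSize x IOsize input patterns
B = D(IOsize+1:end,:)';    % DataSize x Outsize output patterns
```

Here A comes out as [1 2 3; 4 5 6] and B as [1 0; 0 1], the same row-per-pattern arrangement that ReadIO produces.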
%
% Here we go:
%
>> [A,B] = ReadIO(32,11,99,'90trna.cop');
%
% So A stores the Input patterns and B the output patterns
%
%
% we need the min and max values for each input
%
>> help min
MIN Smallest component.
For vectors, MIN(X) is the smallest element in X. For matrices,
MIN(X) is a row vector containing the minimum element from each
column. For N-D arrays, MIN(X) operates along the first
non-singleton dimension.
[Y,I] = MIN(X) returns the indices of the minimum values in vector I.
If the values along the first non-singleton dimension contain more
than one minimal element, the index of the first one is returned.
MIN(X,Y) returns an array the same size as X and Y with the
smallest elements taken from X or Y. Either one can be a scalar.
[Y,I] = MIN(X,[],DIM) operates along the dimension DIM.
When complex, the magnitude MIN(ABS(X)) is used. NaN's are ignored
when computing the minimum.
Example: If X = [2 8 4; 7 3 9] then min(X,[],1) is [2 3 4],
min(X,[],2) is [2; 3] and min(X,5) is [2 5 4; 5 3 5].
See also MAX, MEDIAN, MEAN, SORT.
%
% Our A is 99 x 32, so we want the min and max of each
% column
%
>> Min = min(A);
>> Max = max(A);
>> size(Min)
ans =
1 32
>> Pr = zeros(32,2);
>> Pr(1:32,1) = Min';
>> Pr(1:32,2) = Max';
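The same range matrix can be built in one line. The sketch below uses a small hypothetical A (2 patterns, 3 inputs) so the result can be checked by hand. Naming the dimension explicitly in min and max is a precaution: if A ever had a single row, plain min(A) would collapse it to a scalar.

```matlab
% Hypothetical data: 2 patterns (rows) with 3 inputs (columns) each.
A = [2 8 4; 7 3 9];

% Column-wise minima and maxima, transposed and placed side by side,
% give one [min max] row per input element.
Pr = [min(A,[],1)' max(A,[],1)'];
% Pr is [2 7; 3 8; 4 9]
```

The toolbox's own minmax function, listed in the help output above as "Ranges of matrix rows", should give the same matrix when applied to A'.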
% We want 32 inputs (S1), 7 hidden neurons (S2) and 11 output neurons (S3)
>> S1 = 32;
>> S2 = 7;
>> S3 = 11;
%
% TFi - Transfer function of ith layer, default = 'tansig'.
%
>> TF1 = 'tansig';
>> TF2 = 'tansig';
>> TF3 = 'tansig';
%
% set up BTF - Backprop network training function, default = 'trainlm'.
%
>> BTF = 'trainlm';
%
% BLF - Backprop weight/bias learning function, default = 'learngdm'
%
>> BLF = 'learngdm';
%
% PF - Performance function, default = 'mse'.
%
>> PF = 'mse';
%
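% Now we can set up the FFN. The 32 inputs are implied by the
% 32 rows of Pr, so the size list holds only the hidden and
% output layer sizes (S2 and S3) and their transfer functions.
%
>> net = newff(Pr,[S2 S3],{TF2 TF3},BTF,BLF,PF);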
%
% To recap, we needed to set up the following
%
%net = newff(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF)
% Description
%
% NEWFF(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF) takes,
% PR - Rx2 matrix of min and max values for R input elements.
% Si - Size of ith layer, for Nl layers.
% TFi - Transfer function of ith layer, default = 'tansig'.
% BTF - Backprop network training function, default = 'trainlm'.
% BLF - Backprop weight/bias learning function, default = 'learngdm'.
% PF - Performance function, default = 'mse'.
% and returns an N layer feed-forward backprop network.
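The NEWFF help example passes its inputs P with one sample per column, whereas ReadIO stores one pattern per row. Below is a minimal sketch of the reorientation that would be needed before training. It uses stand-in data with the same shapes as A and B (the values are hypothetical), and the toolbox calls are left as comments so the sketch runs without the toolbox.

```matlab
% Stand-in data with the shapes used in this section (hypothetical values).
A = rand(99,32);      % 99 input patterns, one per row
B = zeros(99,11);     % 99 output patterns, one per row

% The toolbox expects one column per pattern, so transpose both.
P = A';               % 32 x 99
T = B';               % 11 x 99

% With the Neural Network Toolbox on the path, training would then follow
% the pattern shown in the NEWFF help text:
%   net.trainParam.epochs = 50;
%   net = train(net,P,T);
%   Y = sim(net,P);
```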