The Trainer block can train any neural network in NeurOn-Line. Attach one of its
action links to a neural network, and attach the other action link to a data set. You
cannot attach more than one neural network or data set to the block.
To configure the Trainer block, choose configure from its menu. The configuration
panel contains different options depending on what type of neural network is
connected to the block. For more information, see "Configuring".
To evaluate the Trainer block, either pass it a control signal or choose evaluate
from its menu. The Trainer trains the neural network with the data from the data
set by adjusting the network's weights and other internal parameters. The Trainer
block does not modify the network's architecture or any data in the data
set. You may evaluate either the attached neural network or data set during
training. If you evaluate the neural network during training, the output is based
on the weights before the training started.
When training finishes, the Trainer passes a control signal on cp-out
and a scalar value on dp-out. The scalar value tells you how well the network trained. If you
are training a Backpropagation, Autoassociative, or Radial Basis Function
Network, it passes the square root of the mean of the squared errors for all the
training data. If you are training a Rho Network, it passes the negative mean of
the logarithms of the predicted probabilities for all training data. In both cases,
lower numbers mean a closer fit of the training data.
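The two fit measures described above can be sketched in a few lines of Python. This is an illustration only, not NeurOn-Line code: the function names are hypothetical, and it assumes the squared errors are averaged over every output element of every training sample, which the text does not spell out.

```python
import math

def rms_error(targets, predictions):
    """Square root of the mean of the squared errors over all training data.

    Assumption: the mean is taken over every output element of every sample.
    """
    squared = [
        (t - p) ** 2
        for target_row, prediction_row in zip(targets, predictions)
        for t, p in zip(target_row, prediction_row)
    ]
    return math.sqrt(sum(squared) / len(squared))

def negative_mean_log_probability(probabilities):
    """Negative mean of the logarithms of the predicted probabilities."""
    return -sum(math.log(p) for p in probabilities) / len(probabilities)

# Made-up example data, for illustration only.
targets = [[1.0, 0.0], [0.0, 1.0]]
predictions = [[0.9, 0.1], [0.2, 0.8]]
fit = rms_error(targets, predictions)
rho_fit = negative_mean_log_probability([0.9, 0.8, 0.95])
```

Both functions return zero for a perfect fit and grow as the fit gets worse, which matches the rule that lower numbers mean a closer fit.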
Watching the Training Happen
Sometimes you may want to watch the Trainer's progress, especially for long
training runs. When you select Enable RPC from the Remote Procedure menu in
the Print Configuration Tools menu bar, the Trainer displays a table of its
progress. If you launched the NeurOn-Line remote procedure yourself, the table
appears in the window where you launched it. If you let NeurOn-Line launch the
remote procedure automatically, the table appears in the window where you
launched G2.
GRADIENTS  FUNCTIONS  OBJECTIVE  METHOD
0          1          49.87
1          4          3.65       steepest descent
2          7          3.56       Broyden
3          11         3.445      Broyden
4          15         3.239      Broyden
6          18         2.81       steepest descent
7          23         2.402      Broyden
8          26         1.54       Broyden
9          30         1.041      Broyden
10         33         0.9416     Broyden
11         36         0.8692     Broyden
12         39         0.8096     Broyden
Maximum number of function calls exceeded
The first column is the number of passes (gradients) the Trainer has completed.
The second column is the cumulative number of times the Trainer has evaluated the objective function through that pass. Each evaluation of the objective function represents one "step," and the total is bounded by the maximum iterations (in this case 40). Typically, the Trainer takes 3 to 5 steps in a given direction before it calculates a new gradient.
The third column is the value of the least-squares objective function. The lower the value, the closer the Trainer is to a good fit. Typically, when the training begins, the objective decreases greatly with each pass (or gradient). As the training comes to an end, the objective decreases much more slowly.
The last column is the training method used on this pass (or gradient). This is usually the method you specified in the configuration panel. However, the Trainer may use steepest descent from time to time to accelerate the training.
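If you capture the progress table as text, a short script can turn it into records and show how much the objective dropped on each pass. The sketch below is plain Python text handling, not part of NeurOn-Line. The names parse_progress and objective_drops are hypothetical, and the code assumes the four-column layout shown in the table above.

```python
def parse_progress(lines):
    """Parse progress rows; skip the header and any status messages."""
    rows = []
    for line in lines:
        parts = line.split(None, 3)  # the method name may contain spaces
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        rows.append({
            "gradients": int(parts[0]),
            "functions": int(parts[1]),
            "objective": float(parts[2]),
            "method": parts[3].strip() if len(parts) > 3 else None,
        })
    return rows

def objective_drops(rows):
    """Decrease in the objective between consecutive rows."""
    return [a["objective"] - b["objective"] for a, b in zip(rows, rows[1:])]

output = """GRADIENTS FUNCTIONS OBJECTIVE METHOD
0 1 49.87
1 4 3.65 steepest descent
2 7 3.56 Broyden
Maximum number of function calls exceeded"""

rows = parse_progress(output.splitlines())
drops = objective_drops(rows)
```

Large early drops followed by small later ones are the pattern the third column usually shows.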
Once a neural network is attached to the Trainer, the configure menu option brings up a
configuration panel appropriate to the type of network to be trained.
This is the configuration panel you see when you are training a Backpropagation or Autoassociative Network.
The following headings describe how to configure the block.
Choosing the Maximum Number of Training Iterations
The Trainer improves the weights in a number of individual steps. To specify the
maximum number of steps, enter a number in the Maximum Iterations attribute.
Typically, values range from 50 to as much as 1000. However, you will usually
want to use several short training runs so you can monitor the Trainer's progress.
When you evaluate the Trainer again, it continues the training from where it
stopped.
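The idea of several short runs that each continue from the previous weights can be sketched with a toy one-weight model. This is an illustration under stated assumptions, not the Trainer's algorithm: the train function, the learning rate, and the data are all hypothetical, and plain gradient descent stands in for the real training methods.

```python
def train(weight, data, max_iterations, learning_rate=0.02):
    """Run at most max_iterations gradient steps on y = weight * x.

    Training starts from the weight passed in, so a later call continues
    where an earlier call stopped. Returns the new weight and the RMS error.
    """
    for _ in range(max_iterations):
        gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= learning_rate * gradient
    error = (sum((weight * x - y) ** 2 for x, y in data) / len(data)) ** 0.5
    return weight, error

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # made-up data, true weight is 2

weight = 0.0
history = []
for run in range(4):
    # Four short runs of five iterations each, checking the error in between.
    weight, error = train(weight, data, max_iterations=5)
    history.append(error)
```

In this sketch, four runs of five iterations reach the same weight as a single run of twenty, but the error is available for inspection after each short run.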
Choosing the Training Method
To choose how to train the network, select one of the options below Training
Method. The options fall into two main categories:
Once you choose which category of training methods to use, you may want to experiment with the different methods in each category to find out which is best for your network.
yes if your input data has columns that might be correlated.
To speed up training, the block projects your input data vector to a vector with fewer dimensions, trains with the smaller vector, and projects the smaller vector and its training results backwards to obtain results useful for the original vector. In general, this option is recommended if you have more than ten inputs.
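The project, train, and project-back idea can be sketched as follows. The text does not name the projection NeurOn-Line uses, so this example assumes a principal-component projection onto a single direction found by power iteration. All function names are hypothetical, and the training step in the middle is omitted.

```python
import math

def mean_center(rows):
    """Subtract the column means; return the centered rows and the means."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows], means

def top_direction(centered, iterations=200):
    """Power iteration on the covariance matrix for its dominant eigenvector."""
    n, d = len(centered), len(centered[0])
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(centered, direction):
    """Reduce each row to one coordinate along the direction."""
    return [sum(a * b for a, b in zip(r, direction)) for r in centered]

def reconstruct(scores, direction, means):
    """Map the reduced coordinates back to the original space."""
    return [[s * b + m for b, m in zip(direction, means)] for s in scores]

# Three perfectly correlated input columns (made-up data).
inputs = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]
centered, means = mean_center(inputs)
direction = top_direction(centered)
scores = project(centered, direction)       # one value per row instead of three
restored = reconstruct(scores, direction, means)
```

Because the three columns in this example are fully correlated, one coordinate per row is enough to restore the original inputs. With real data the reduction loses some information, and you trade that loss for faster training.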
regular k-means clustering.
class-separate k-means clustering.
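Regular k-means clustering, named in the option above, alternates between assigning each point to its nearest center and moving each center to the mean of its points. The sketch below is a minimal textbook version, not NeurOn-Line's implementation. The function name, the choice of the first k points as starting centers, and the fixed iteration count are assumptions made for illustration.

```python
def k_means(points, k, iterations=20):
    """Return k cluster centers found by plain k-means."""
    centers = [list(p) for p in points[:k]]  # naive start: the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign the point to the center with the smallest squared distance.
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers

# Two well-separated groups of made-up points.
points = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0],
          [10.0, 11.0], [1.0, 0.0], [11.0, 10.0]]
centers = k_means(points, k=2)
```

Each returned center is one row of coordinates, one value per input dimension. A class-separate variant would, under the same assumptions, run this procedure once for the points of each class.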
During training, NOL finds and outputs to the background window the cluster centers as shown in this matrix. Each row represents the coordinates of a cluster center. Up to five centers are shown. Each column represents the coordinates for a dimension.
Matrix In cluster, now m = size = 14 by 10
     1      2      3      4      5      6      7      8
------------------------------------------------------------------------
 1 | 0.565  0.738  0.374  0.38   0.415  0.257  0.372  0.345
 2 | 0.536  0.532  0.536  0.83   0.509  0.482  0.275  0.356
 3 | 0.603  0.687  0.328  0.243  0.394  0.287  0.604
 4 | 0.573  0.446  0.337  0.363  0.749  0.656  0.437  0.777
 5 | 0.675  0.774  0.391  0.328  0.3    0.419  0.685  0.399
If the Unit Overlap attribute is automatic, NOL then determines the optimum unit
widths by searching for the value of p that minimizes the training error, with the
unit centers previously determined by k-means clustering during the first stage of
the training, as shown in the following output:
Optimizing overlap parameter
Trying p = 1.000000, objective = 19.973553
Trying p = 2.000000, objective = 18.852538
Trying p = 3.618034, objective = 17.939625
Trying p = 4.135448, objective = 17.873745
Trying p = 4.972642, objective = 18.574853
Trying p = 4.455228, objective = 17.979710
Trying p = 3.937814, objective = 17.871113
Trying p = 4.025800, objective = 17.867350
Trying p = 3.985542, objective = 17.868146
Trying p = 4.066058, objective = 17.868195
Optimized overlap with p = 4.025800
Final objective function value = 17.665130
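A one-dimensional search like the one in this output can be sketched with a golden-section search. The text does not say which search NOL uses. The step from p = 2 to p = 3.618034 in the output is consistent with a golden-ratio step, but that is an inference, so treat the algorithm below as an assumption. The function names and the toy objective are hypothetical stand-ins for the real training error.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618

def golden_section_minimize(objective, low, high, tolerance=1e-4):
    """Approximate the minimizer of a unimodal objective on [low, high].

    Returns (p, objective(p)). Each loop iteration shrinks the interval by
    the golden ratio and needs only one new objective evaluation.
    """
    c = high - INV_PHI * (high - low)
    d = low + INV_PHI * (high - low)
    fc, fd = objective(c), objective(d)
    while high - low > tolerance:
        if fc < fd:
            # The minimum lies in [low, d]; reuse c as the new upper probe.
            high, d, fd = d, c, fc
            c = high - INV_PHI * (high - low)
            fc = objective(c)
        else:
            # The minimum lies in [c, high]; reuse d as the new lower probe.
            low, c, fc = c, d, fd
            d = low + INV_PHI * (high - low)
            fd = objective(d)
    p = (low + high) / 2
    return p, objective(p)

def toy_objective(p):
    # Hypothetical stand-in for the training error as a function of overlap p.
    return 17.87 + 0.5 * (p - 4.0) ** 2

best_p, best_value = golden_section_minimize(toy_objective, 1.0, 8.0)
```

As in the output above, the trial values of p close in on the minimum from both sides while the objective changes less and less.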
Which option you choose depends on what kind of problem you are trying to
solve:
treat data as single class. The Trainer ignores any output values in the data set. When you later evaluate the Rho Network, its output will be the probability that the input data is part of the distribution defined by the training set.
treat output data as class label. The number of output values in the data set corresponds to the number of possible classes. In each data pair, the output that corresponds to the element's class should be one, and the rest of the outputs should be zero. When you later evaluate the Rho Network, its output will be a vector with one element for each class, and each element will be the probability that the input belongs to that class.
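The class-label layout described above, with a one for the element's class and zeros elsewhere, can be sketched as follows. The helper names are hypothetical and the probabilities are made up. This only illustrates the data format, not how the Rho Network computes its output.

```python
def one_hot(class_index, num_classes):
    """Output vector with a one for the element's class and zeros elsewhere."""
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]

def most_probable_class(probabilities):
    """Index of the class with the highest probability in an output vector."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])

# Three training elements belonging to classes 0, 2, and 1 of three classes.
labels = [0, 2, 1]
outputs = [one_hot(label, 3) for label in labels]

# A made-up output vector from an evaluated network: one probability per class.
predicted = most_probable_class([0.1, 0.7, 0.2])
```

Here outputs holds the rows you would place in the output columns of the data set, and predicted picks the most likely class from an output vector.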
evaluate from the Trainer block's menu.
See Also
For more information on how to use this block, see the pages below.
Basic Block Behavior
"Neural Network Blocks" Chapter
Data Set
Fit Tester
Train and Test
Five Fold CV
Copyright © 1996, Gensym Corporation, Inc.