Data Set


A Data Set stores data pairs for training and testing neural networks.

A Data Set contains three matrices: input, target, and predictions. Whenever the Data Set receives a data pair, it appends the data pair's X vector as a new row of the input matrix and the data pair's Y vector as a new row of the target matrix. When a Fit Tester tests a neural network with a Data Set, it fills the predictions matrix with the values that the network predicts for each row of the input matrix.

The number of columns in the input matrix is the same as the dimension of the largest X vector. The number of columns in the target and predictions matrices is the same as the dimension of the largest Y vector. If the Data Set receives a data pair with an X or Y vector that has fewer elements than the input or target matrix has columns, the Data Set pads that vector with zeros. If the Data Set receives a data pair with an X or Y vector that has more elements than the input or target matrix has columns, the Data Set widens the appropriate matrix and pads the existing rows with zeros.
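The padding rule above can be illustrated with a short sketch. This is Python rather than G2, and the class and method names (DataSetSketch, add_pair) are hypothetical; it models only the rule as described here and is not NeurOn-Line code.

```python
class DataSetSketch:
    """Toy model of the zero-padding rule; names are hypothetical."""

    def __init__(self):
        self.inputs = []   # one row per data pair (X vectors)
        self.targets = []  # one row per data pair (Y vectors)

    @staticmethod
    def _append_padded(matrix, vector):
        width = len(matrix[0]) if matrix else 0
        if len(vector) > width:
            # The new vector is wider: pad every existing row with zeros.
            for row in matrix:
                row.extend([0.0] * (len(vector) - width))
            width = len(vector)
        # The new vector is narrower (or equal): pad it up to the matrix width.
        matrix.append(list(vector) + [0.0] * (width - len(vector)))

    def add_pair(self, x, y):
        self._append_padded(self.inputs, x)
        self._append_padded(self.targets, y)


ds = DataSetSketch()
ds.add_pair([1.0, 2.0], [5.0])
ds.add_pair([3.0], [6.0, 7.0])        # short X is padded; long Y widens targets
ds.add_pair([4.0, 8.0, 9.0], [1.0])   # long X widens inputs; short Y is padded
print(ds.inputs)
print(ds.targets)
```

After the three calls, every row of each matrix has the same number of columns, which matches the largest X or Y vector received so far.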

The dp-out of the Data Set is the number of data pairs.

Editing the Data Set

To edit a data set, you set its dimensions and then enter or view its data, as described in the following sections.

Setting the Dimensions of the Data Set

To set the dimensions of the data set, select the edit data set. . . menu choice on the Data Set block. When you first edit a Data Set that contains no data, NeurOn-Line displays this dialog for entering the Number of Samples, the Number of Inputs, and the Number of Targets:


Enter values for each of these attributes, and click the OK button to display the spreadsheet for editing the data set.

If your data set already contains data and you select the edit data set. . . menu choice, NeurOn-Line does not display this dialog. Instead, NeurOn-Line displays the spreadsheet directly.

You can edit the dimensions of the data from the spreadsheet by selecting this button in the spreadsheet:

Selecting this button displays the Edit Data Set Dimensions dialog for you to edit the dimensions of the existing data.

Entering and Viewing Data

To edit the contents of a Data Set that is initially empty, click OK in the Enter Data Set Dimensions dialog displayed above. NeurOn-Line displays a spreadsheet for editing the inputs and targets of the data set, and for viewing the predictions, timestamps, and quality.

To view or edit the contents of a Data Set that already contains data, simply select the edit data set. . . menu choice. NeurOn-Line displays the spreadsheet directly.

Here is a spreadsheet for a data set with four inputs and three targets:


The samples are numbered down the left side of the editor. The editor shows samples 1 through 8. To see the other samples, use the vertical scroll bar. The data is split into four sections labeled Timestamps, Quality, Inputs, and Outputs. If there is more than one input or output in each sample, the Inputs and Outputs sections can contain several columns, numbered 0, 1, and so on. To see columns that do not fit in the editor, use the horizontal scroll bars.

You enter input and output data for the data set in the spreadsheet. For more information on how to use the spreadsheet, see "Using the GXL Spreadsheet to Edit Data".

Saving and Loading Data

You can save or load the complete data set or any part of the data set to or from a file.

To load a data set from a file, select the file operations. . . menu choice on the Data Set block to display this dialog:


Enter the name of the file from which to load the data, and select the Load from File button.

To save a data set to a file, select the file operations. . . menu choice, enter the filename, and select the Save to File button.

You can also load and save parts of the data set by first selecting the cells or rows in the spreadsheet and then using the spreadsheet buttons for loading and saving data.

For more information on how to use the spreadsheet, see "Using the GXL Spreadsheet to Edit Data".

Plotting Data

To create a chart of the Data Set's input data, target data, or predictions, select the plot data. . . menu choice on the Data Set. NeurOn-Line displays this dialog:


For more information on how to use this dialog, see "Data Set Plot".

There is one difference between how the plot data. . . menu choice and the Data Set Plot block work. If the attribute Chart Name is set to G2, NeurOn-Line creates a workspace for your chart, like the following.


When you create a plot from the Data Set configuration dialog, this subworkspace includes two buttons that are not included when you configure a Data Set Plot block. If you press Delete Plot, NeurOn-Line deletes the subworkspace and its chart. If you press Iconify, NeurOn-Line creates a Data Set Plot block with the configuration you specified, and attaches it to the Data Set.

Text Format for Data Sets

The text format for saving and loading data sets from files consists of the following lines:

  1. The version number. For this version of NeurOn-Line, it is 1.

  2. The number of data pairs in the Data Set.

  3. The number of elements in each input vector.

  4. The number of elements in each target vector.

  5. Several lines of data, one line for each data pair in the Data Set. Each line contains the following items, separated with commas:

    1. The number of the data pair, numbered consecutively starting with 0.

    2. The time stamp for the data pair. It can be either a float or an integer.

    3. The quality of the data pair. It can be OK, manual, or no-value.

    4. The input and target values of the data pair, starting with the input values.

Optionally, a line can contain a comment, which begins with a semicolon and continues to the end of the line.
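As an illustration of the layout described above, here is a minimal sketch that writes data pairs in this text format. It is Python rather than G2, and the function name format_data_set and its parameters are hypothetical, not part of the NeurOn-Line API. The nine decimal places are an assumption chosen for readability; the format description does not require a particular precision.

```python
def format_data_set(pairs, n_inputs, n_targets, version=1):
    """Return the text form of a data set.

    pairs is a list of (timestamp, quality, inputs, targets) tuples.
    Hypothetical helper for illustration only.
    """
    lines = [
        f"{version} ; Version of the text format",
        f"{len(pairs)} ; Number of data pairs",
        f"{n_inputs} ; Elements in each input vector",
        f"{n_targets} ; Elements in each target vector",
    ]
    for index, (timestamp, quality, inputs, targets) in enumerate(pairs):
        if len(inputs) != n_inputs or len(targets) != n_targets:
            raise ValueError(f"data pair {index} has the wrong dimensions")
        # Input values come first, then target values.
        values = [f"{v:.9f}" for v in list(inputs) + list(targets)]
        # Data pairs are numbered consecutively starting with 0.
        lines.append(", ".join([str(index), str(timestamp), quality] + values))
    return "\n".join(lines) + "\n"


text = format_data_set(
    [(9516, "OK", [0.0, 0.0], [0.0]), (9520, "OK", [0.0, 1.0], [1.0])],
    n_inputs=2,
    n_targets=1,
)
print(text)
```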

Here is an example of a Data Set stored as text.

1; Version of this save/restore protocol for data sets
4 ; Number of samples in this data-set
2 ; Length of each input data vector
1 ; Length of each output data vector
0, 9516, OK, 0.000000000,0.000000000, 0.000000000
1, 9520, OK, 0.000000000,1.000000000, 1.000000000
2, 9524, OK, 1.000000000,0.000000000, 1.000000000
3, 9528, OK, 1.000000000,1.000000000, 0.000000000
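A file like the example above could be read outside G2 with a short parser such as the following sketch. It is Python rather than G2, and the function name parse_data_set and the keys of the returned dictionary are hypothetical. The sketch assumes that a semicolon never appears inside a data value and parses every time stamp as a float.

```python
EXAMPLE = """\
1; Version of this save/restore protocol for data sets
4 ; Number of samples in this data-set
2 ; Length of each input data vector
1 ; Length of each output data vector
0, 9516, OK, 0.000000000,0.000000000, 0.000000000
1, 9520, OK, 0.000000000,1.000000000, 1.000000000
2, 9524, OK, 1.000000000,0.000000000, 1.000000000
3, 9528, OK, 1.000000000,1.000000000, 0.000000000
"""


def parse_data_set(text):
    """Parse the text format into plain Python lists (hypothetical helper)."""
    # Drop comments (from ';' to the end of the line) and blank lines.
    lines = [line.split(";", 1)[0].strip() for line in text.splitlines()]
    lines = [line for line in lines if line]
    version, n_pairs, n_inputs, n_targets = (int(v) for v in lines[:4])
    rows = lines[4:]
    if len(rows) != n_pairs:
        raise ValueError(f"expected {n_pairs} data pairs, found {len(rows)}")

    timestamps, quality, inputs, targets = [], [], [], []
    for row in rows:
        fields = [field.strip() for field in row.split(",")]
        if len(fields) != 3 + n_inputs + n_targets:
            raise ValueError(f"wrong number of items in line: {row!r}")
        # fields[0] is the data pair number; fields[1] is the time stamp.
        timestamps.append(float(fields[1]))
        quality.append(fields[2])
        values = [float(field) for field in fields[3:]]
        inputs.append(values[:n_inputs])    # input values come first
        targets.append(values[n_inputs:])   # target values follow
    return {
        "version": version,
        "timestamps": timestamps,
        "quality": quality,
        "inputs": inputs,
        "targets": targets,
    }


data = parse_data_set(EXAMPLE)
print(data["inputs"])
print(data["targets"])
```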

Customizing the Text Format

By writing your own G2 procedures, you can customize the file format associated with a data set.


Note: For more information on NeurOn-Line's application programmers' interface (API), see Chapter 5, "API Procedures".

In the data set block's attribute table, set the attributes File-save-procedure and File-load-procedure to the names of the procedures that write and read files in your format. Your file save and load procedures must save and load the following attributes of a data set:

Use the API procedure nol-configure-data-set to resize the elements of a data set. The procedure g2-get-matrix-dimensions tells you the current dimensions of the Data Set's input and output matrices. These API procedures are provided for saving and loading parts of data sets:

Clearing the Data Set

To clear the data set, select the clear data set menu choice. NeurOn-Line displays this dialog:


Click Yes to clear the data set. Click No to keep the data set unchanged.

Making Values Permanent

When you choose make permanent from the Data Set's menu, NeurOn-Line saves all of the Data Set's current values.

Configuring

A Data Set has no configurable attributes.

See Also

For more information on how to use this block, see the pages below.

Basic Block Behavior
Saving a Block's Data After Resetting G2
Chapter 8, "Training Blocks"

Copyright © 1996, Gensym Corporation, Inc.