Next: Data From SCDS Up: The Experiments Previous: The Experiments

Data From UCI Repository

The following shorthand is used to describe the properties of the data sets:

Shorthand      Explanation
rel. attr.     Number of attributes relevant to the class
irrel. attr.   Number of attributes not relevant to the class
% max          Percentage of objects in the most common class
training       Number of tuples in the training set
test           Number of tuples in the test set
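
As a small illustration of the "% max" measure defined above, the following sketch computes the percentage of objects in the most common class from a list of class labels. The function name `percent_max` and the toy labels are hypothetical and are not taken from any of the data sets used here.

```python
from collections import Counter


def percent_max(labels):
    """Return the percentage of objects that belong to the most common class."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    # most_common(1) returns a list holding one (class, count) pair.
    most_common_count = counts.most_common(1)[0][1]
    return 100.0 * most_common_count / len(labels)


# Hypothetical toy labels (assumption for illustration only).
toy_labels = ["yes", "yes", "no", "yes", "no"]
toy_percent_max = percent_max(toy_labels)  # 3 of the 5 objects are "yes"
```

This value is also the accuracy that a classifier would reach by always answering with the most common class, which makes it a natural baseline when reading classification results.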

The data sets used have these characteristics:

Data set                            rel. attr.  irrel. attr.  % max  training  test
Hayes-Roth & Hayes-Roth Database             3             2   38.6       132    28
Postoperative Patient Data                   8             0   71.1        45    45
Tic-Tac-Toe Endgame Database                 9             0   65.3        47   911

Rule Generation
When generating reducts from RSES and RGEN, the following options were set:

For RGEN, two additional parameters were used:

Classification Results
We used simple voting as the classification strategy for RSES, because simple voting usually performs best when the attribute values are nominal, as they are in these data sets. (Refer to [Syn] for other strategies.) For RGEN, several classification methods were used; they are listed in the table "Methods used for classification of data from UCI". The test results are given in the table "Results of classification of UCI data sets".
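
The idea behind simple voting can be sketched as follows: every rule whose conditions match the object casts one vote for its decision class, and the class with the most votes is returned. This is a minimal sketch, not the RSES implementation. The rule representation (a dictionary with `conditions` and `decision` keys), the function names, and the toy rules are assumptions made for illustration.

```python
from collections import Counter


def matches(rule, obj):
    """A rule fires when every condition attribute equals the object's value."""
    return all(obj.get(attr) == value for attr, value in rule["conditions"].items())


def simple_vote(rules, obj, default=None):
    """Each firing rule casts one vote for its decision class."""
    votes = Counter(rule["decision"] for rule in rules if matches(rule, obj))
    if not votes:
        # No rule matched the object; fall back to the caller's default.
        return default
    # On a tie, Counter.most_common keeps the class that was counted first.
    return votes.most_common(1)[0][0]


# Hypothetical toy rules (assumption, not generated by RSES or RGEN).
toy_rules = [
    {"conditions": {"a": 1}, "decision": "x"},
    {"conditions": {"b": 2}, "decision": "y"},
    {"conditions": {"a": 1, "b": 2}, "decision": "x"},
    {"conditions": {"a": 3}, "decision": "y"},
]
toy_class = simple_vote(toy_rules, {"a": 1, "b": 2})  # two votes for "x", one for "y"
```

How ties and unmatched objects are handled is a design choice that the text does not specify; the sketch simply exposes a `default` argument for the unmatched case.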

 

Method  Description
1       RSES - Simple voting
2       RGEN - Simple voting, use only top node of lattice (simulating RSES)
3       RGEN - Highest accuracy, all nodes
4       RGEN - Simple voting with linear weight, all nodes
5       RGEN - Simple voting with exponential weight correction of accuracy
6       RGEN - Simple voting with squared weight correction
7       RGEN - Measurement of evidence method
Table: Methods used for classification of data from UCI
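
To make the difference between the voting variants in the table more concrete, the sketch below contrasts a "highest accuracy" choice with voting in which each firing rule's vote is weighted by a function of its accuracy. The exact weight functions used by RGEN are not given in this section, so the linear, squared, and exponential forms below are illustrative assumptions only, as are all names and the toy data.

```python
import math
from collections import defaultdict


def weighted_vote(fired_rules, weight):
    """Sum weight(accuracy) per decision class over (decision, accuracy) pairs."""
    totals = defaultdict(float)
    for decision, accuracy in fired_rules:
        totals[decision] += weight(accuracy)
    # Return the class with the largest total, or None if no rule fired.
    return max(totals, key=totals.get) if totals else None


def highest_accuracy(fired_rules):
    """Return the decision of the single most accurate firing rule."""
    if not fired_rules:
        return None
    return max(fired_rules, key=lambda pair: pair[1])[0]


# Assumed weight functions; the source does not define them.
WEIGHTS = {
    "simple": lambda acc: 1.0,            # every rule counts equally
    "linear": lambda acc: acc,            # vote proportional to accuracy
    "squared": lambda acc: acc ** 2,      # favors accurate rules more strongly
    "exponential": lambda acc: math.exp(acc),
}

# Hypothetical firing rules as (decision, accuracy) pairs.
toy_fired = [("x", 0.95), ("y", 0.6), ("y", 0.6)]
toy_results = {name: weighted_vote(toy_fired, w) for name, w in WEIGHTS.items()}
toy_best = highest_accuracy(toy_fired)
```

On this toy input, unweighted voting picks "y" because two rules vote for it, while the squared weight and the highest-accuracy choice both pick "x", which shows how the weighting can change the outcome when a few accurate rules compete with several weaker ones.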

 

 

                Method
Data set        1     2     3     4     5     6     7
Hayes-Roth      85    82.1  92.9  89.3  89.3  89.3  89.3
Postoperative   57    64.4  64.4  66.7  66.7  66.7  55.6
Tic-Tac-Toe     71.1  73.3  65.4  72.0  72.3  71.9  70.1
Table: Results of classification of UCI data sets

 



Helge Grenager Solheim
Sat May 4 03:30:02 MET DST 1996