The algorithm for default rules which will be considered, systematically uses subsets of the condition attributes in the information system when generating rules. Each pass uses one subset, and generates rules based on this. Now, for an information system, where C is the set of condition attributes, there will here be different subsets. These subsets will be called projections. For instance, for an information system with condition attributes , the different projections are , , , , , , and . If the information system is of reasonable size, calculation of rules using the attribute values of the projected attributes for the objects from just one of these projections is likely to be a heavy task. As shown in [MS96], many of these projections will lead to the same set of rules. The main idea behind the algorithm is a method for selecting which of these projections will lead to new sets of rules.
For each projection, the algorithm calls a rule generating algorithm, which takes as input the set of projected attributes. The rule generating algorithm also takes as input the condition attributes not in the projection, and uses these to produce possible blocks which applies to the generated rules.
The whole set of possible projections is a lattice, while one projection is a node in this lattice. The set of projected attributes will be denoted , while the rest of the condition attributes, the ones ``cut away'', will be denoted .
The top level node, the one with all condition attributes included, is first submitted in a call to the rule generating procedure. Then, to find nodes of interest further down the lattice, the algorithm uses the fact that if the attribute(s) found in one conjunct of the discernibility matrix is removed from the matrix, new indeterminism is created. For instance, if a discernibility matrix consists of the two elements (a) and (b,c), either the attribute a, or both the attributes b and c can be removed from the matrix in order to join classes. When generating rules based on this reduced discernibility matrix, a new set of rules will result, also shown in [MS96].
This was the main idea. A simplified version is shown in
Figures
and
. The
name
defaultRules,
indicating generation of default rules, is the main
function. The function defaultRuleGen in the algorithm, recursively
calls itself.
Figure: The defaultRules function
Figure: The defaultRuleGen function
If one is not interested in rules with a validity less than a certain number, a threshold value may be set. This is not shown in the pseudo code of the algorithm, however. Also not shown are methods for avoiding to visit the same node twice.