
An Algorithm for Default Rules Generation

 

As pointed out earlier, noisy data and limited information in the data cause many problems. It would be preferable to have rules that can serve as defaults without having to be correct in every specific case. To this end, an algorithm that creates both exact and default rules has been developed. The algorithm is based on Rough Sets theory and uses supervised learning. It was first described by T. Mollestad in [Mol95], and later in [MS96] and [Mol96]. Jon Petter Hjulstad implemented the algorithm as part of his diploma thesis [Hju96], where the implementation is described.

The goal of the algorithm is to find important dependencies in the data set, even when the data are to some degree inconsistent and vague. This situation is common, so being able to handle it properly is very useful. Inconsistencies and vagueness may result, for instance, from measurement errors and typing errors. Cases where some attribute values are missing for certain objects can also lead to vagueness. Yet another source of vagueness arises when the objects do not cover all dependency rules inherent to the area of interest. It may also happen that attributes relevant to the decision are not present in the data set at hand.
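To make the notion of inconsistency concrete, the following minimal sketch groups the objects of a small decision table by their condition attribute values and reports the groups whose members disagree on the decision. The table, the attribute names and the function name inconsistent_classes are hypothetical and chosen only for illustration; they are not taken from [Mol95] or [Hju96].

```python
from collections import defaultdict

# Hypothetical decision table: (condition attribute values, decision).
TABLE = [
    ({"temp": "high", "cough": "yes"}, "flu"),
    ({"temp": "high", "cough": "yes"}, "cold"),    # same conditions, other decision
    ({"temp": "normal", "cough": "yes"}, "cold"),
    ({"temp": "normal", "cough": "no"}, "healthy"),
]


def inconsistent_classes(table):
    """Return the condition-value groups whose objects disagree on the decision."""
    decisions = defaultdict(set)
    for conditions, decision in table:
        # Objects with identical condition values are indiscernible.
        key = tuple(sorted(conditions.items()))
        decisions[key].add(decision)
    return {key: found for key, found in decisions.items() if len(found) > 1}


conflicts = inconsistent_classes(TABLE)
for key, found in conflicts.items():
    print(dict(key), "->", sorted(found))
```

In this table the first two objects are indiscernible on the condition attributes but carry different decisions, so no deterministic rule can cover them.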

Because of inconsistencies and vagueness in the data, deterministic rules derived from the data set at hand may be very specific to that data set, in the sense that condition attributes which are irrelevant to the decision class may become part of the rules. Also, when such rules are used to classify a data set of similar cases, many objects may not be covered by any of the generated rules.

The algorithm mentioned above helps to address these problems. Its output is a much more complete set of rules than the deterministic rules alone. In addition to the deterministic rules, it produces rules with shorter conditional parts, which makes them less restrictive. Each rule comes with a validity measure that indicates the strength of the rule. With these rules, it is more likely that an applicable rule can be found when the values of only a subset of the attributes are known, or when previously unseen attribute values appear.
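The idea of shorter rules with an attached strength can be illustrated with a small sketch. It is not the algorithm of [Mol95]; it simply enumerates every subset of the condition attributes by brute force, and it assumes that the validity of a rule is the fraction of objects matching its conditional part that also have its decision. The table, the function name rules_with_validity and the threshold value are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical decision table: (condition attribute values, decision).
TABLE = [
    ({"temp": "high", "cough": "yes"}, "flu"),
    ({"temp": "high", "cough": "yes"}, "flu"),
    ({"temp": "high", "cough": "yes"}, "cold"),
    ({"temp": "normal", "cough": "yes"}, "cold"),
    ({"temp": "normal", "cough": "no"}, "healthy"),
]


def rules_with_validity(table, attributes, threshold=0.6):
    """Brute-force sketch: build rules over every attribute subset.

    Validity is assumed to be the share of objects matching the
    conditional part that also carry the rule's decision.
    """
    rules = []
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            groups = defaultdict(Counter)
            for conditions, decision in table:
                key = tuple((a, conditions[a]) for a in subset)
                groups[key][decision] += 1
            for key, counts in groups.items():
                total = sum(counts.values())
                for decision, n in counts.items():
                    validity = n / total
                    if validity >= threshold:  # hypothetical cut-off
                        rules.append((dict(key), decision, validity))
    return rules


rules = rules_with_validity(TABLE, ["temp", "cough"])
for conditions, decision, validity in rules:
    print(conditions, "->", decision, round(validity, 2))
```

Rules with validity 1.0 correspond to deterministic rules, while those with lower validity play the role of defaults. The single-attribute rule on cough = no can be applied to an object whose temp value is unknown, which is the situation described above.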





Helge Grenager Solheim
Sat May 4 03:30:02 MET DST 1996