
Rough Sets Applications

The Rough Sets framework has been used with success in several data mining applications. Here we mention some of these:
RSES: The RSES system [Syn] was developed in Poland and uses state-of-the-art techniques from Rough Sets theory. The system is available for Hewlett Packard workstations, with or without a graphical interface. A maximum of 30,000 objects with 16,000 attributes may be processed for rule generation. The limit on the number of attributes is rarely a constraint in practice, but the relatively low maximum number of objects prevents the system from being used on large data sets.

Newer versions of RSES will support scaling of attribute values. This means that real-valued attributes may be discretized into a number of intervals. It is also possible to join multiple attributes and represent them as one. This leads to faster calculation of reducts.
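
As an illustration of what such scaling amounts to, the following minimal sketch discretizes a real-valued attribute into equal-width intervals. This is only one of several possible schemes and is not claimed to be the method used by RSES; the function names and the sample values are hypothetical.

```python
from bisect import bisect_right

def equal_width_cuts(values, n_intervals):
    """Return n_intervals - 1 cut points that split [min, max] evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    return [lo + i * width for i in range(1, n_intervals)]

def discretize(value, cuts):
    """Map a real value to the index of the interval it falls into."""
    return bisect_right(cuts, value)

# Hypothetical real-valued attribute (not taken from any system above).
temperatures = [36.6, 37.2, 38.9, 40.1, 39.4, 36.9]
cuts = equal_width_cuts(temperatures, 3)
codes = [discretize(t, cuts) for t in temperatures]
print(codes)
```

After discretization, objects that previously differed only by small numeric variations fall into the same indiscernibility class, which reduces the number of distinct attribute values the reduct computation has to consider.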

In order to test the generated rules, the database can be split into two parts. Rules are then generated from the first part, and their classification ability is tested on the second.
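
The split-and-test procedure can be sketched as follows. This is an assumed, simplified setup: each row is a pair (condition values, decision), the classifier stands in for an induced rule set, and all names are hypothetical rather than part of RSES.

```python
import random

def split_table(rows, train_fraction=0.5, seed=0):
    """Shuffle the rows and split them into a training and a test part."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(classify, test_rows):
    """Fraction of test rows whose decision the classifier reproduces."""
    hits = sum(1 for cond, decision in test_rows if classify(cond) == decision)
    return hits / len(test_rows)

# Hypothetical decision table: one condition attribute that equals the decision.
rows = [((i % 2,), i % 2) for i in range(10)]
train, test = split_table(rows)

# Stand-in for rules induced from the training part.
def classify(cond):
    return cond[0]

print(len(train), len(test), accuracy(classify, test))
```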

The system supports script programming in order to ease step-by-step experimentation. Further, it is possible to choose between different algorithms for finding minimal rules. This comes in handy when the most thorough algorithm would take approximately 100 years on large data sets.
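
To see why an exhaustive search becomes infeasible, consider the following minimal sketch of a brute-force reduct search. It is not the algorithm used by RSES; it assumes a decision table given as (condition values, decision) pairs and defines a reduct as a minimal attribute subset that preserves the positive region. All names are hypothetical.

```python
from itertools import combinations

def partition(rows, attrs):
    """Group row indices by their values on the given attributes."""
    blocks = {}
    for i, (cond, _) in enumerate(rows):
        blocks.setdefault(tuple(cond[a] for a in attrs), []).append(i)
    return list(blocks.values())

def positive_region(rows, attrs):
    """Indices of rows whose indiscernibility class has a single decision."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][1] for i in block}) == 1:
            pos.update(block)
    return pos

def reducts(rows, n_attrs):
    """All minimal attribute subsets that preserve the positive region."""
    full = positive_region(rows, range(n_attrs))
    found = []
    for size in range(n_attrs + 1):          # smallest subsets first
        for subset in combinations(range(n_attrs), size):
            if any(set(r) <= set(subset) for r in found):
                continue                     # contains a reduct: not minimal
            if positive_region(rows, subset) == full:
                found.append(subset)
    return found

# Hypothetical table: attribute 2 duplicates attribute 1, attribute 0 is noise.
table = [((0, 0, 0), "no"), ((0, 1, 1), "yes"),
         ((1, 0, 0), "no"), ((1, 1, 1), "yes")]
print(reducts(table, 3))
```

The search visits up to 2^n attribute subsets for n attributes, so with thousands of attributes a thorough search is out of reach and faster approximate algorithms become necessary.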
DataLogic/R: DataLogic/R is a database "mining" system from Reduct Systems Inc. The software is based on theories of knowledge representation, inductive logic and rough sets. According to Reduct Systems, their software is unique in that it analyzes logical patterns in data at different levels of knowledge representation. Still according to Reduct Systems, this means that it can discover facts and relationships not accessible with any other method.

The system is written in C to be easily portable. A packaged version is available for PC and handles a maximum of 2000 attributes per object. It has been taken into commercial use within finance: Wall Street analyst Murray Ruggiero Jr. used DataLogic/R in conjunction with neural-network software to generate rules for his trading system.
KDD-R: This system is described in [ZS94], and is based on the Variable Precision Rough Set (VPRS) model. It is implemented in C under UNIX. The system features several units, of which the following are mentioned here.

In its present form, the system also has a limited capability to handle incomplete data and some capability for incremental updates.
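
The central idea of the VPRS model can be sketched as follows: an equivalence class is admitted to the lower approximation of a concept if its relative misclassification error does not exceed a threshold beta, with 0 <= beta < 0.5. The code is a minimal illustration of that definition and is not taken from KDD-R; the function name and the data are hypothetical.

```python
def beta_lower_approximation(classes, concept, beta):
    """Union of the classes whose misclassification error is at most beta."""
    result = set()
    for block in classes:
        # Fraction of the class that lies outside the concept.
        error = 1 - len(block & concept) / len(block)
        if error <= beta:
            result |= block
    return result

# Hypothetical equivalence classes and target concept.
classes = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9, 10}]
concept = {1, 2, 3, 5, 7}

strict = beta_lower_approximation(classes, concept, 0.0)
relaxed = beta_lower_approximation(classes, concept, 0.25)
print(strict, relaxed)
```

With beta = 0 the definition coincides with the classical lower approximation, which is empty here because no class lies entirely inside the concept. With beta = 0.25 the first class is admitted, since only one of its four objects lies outside the concept.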
LERS: LERS (Learning from Examples based on Rough Sets) is another system created for rule induction. The system handles inconsistencies in data sets by following the principles of Rough Sets. These inconsistencies are not corrected; instead, the lower and upper approximations of each concept are calculated. Deterministic and non-deterministic rules are then generated from these approximations.
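
The lower and upper approximations of a concept can be computed as in the following minimal sketch. It assumes a decision table of (condition values, decision) pairs and is an illustration of the standard definitions, not the LERS implementation; the names and the data are hypothetical.

```python
def approximations(rows, concept):
    """Return (lower, upper) approximations of a concept as sets of row indices."""
    # Indiscernibility classes: rows with identical condition values.
    blocks = {}
    for i, (cond, _) in enumerate(rows):
        blocks.setdefault(cond, set()).add(i)
    members = {i for i, (_, decision) in enumerate(rows) if decision == concept}
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= members:       # class lies entirely inside the concept
            lower |= block
        if block & members:        # class overlaps the concept
            upper |= block
    return lower, upper

# Hypothetical table; rows 2 and 3 are inconsistent (same conditions, different decisions).
patients = [
    (("high", "yes"), "flu"),
    (("high", "yes"), "flu"),
    (("high", "no"), "flu"),
    (("high", "no"), "healthy"),
    (("normal", "no"), "healthy"),
]
lower, upper = approximations(patients, "flu")
print(lower, upper)
```

Rules induced from the lower approximation always hold in the data, while rules induced from the upper approximation only possibly hold. The inconsistent rows 2 and 3 appear in the upper approximation but not in the lower one.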

The operator can choose whether the system should operate using methods for machine learning or for knowledge acquisition. In the first case, a single minimal description is calculated that distinguishes each concept from the others. In the second case, all rules in minimal form that can be derived from the given data set are calculated. In both cases the user can choose between local and global approximation.
Other Systems: In addition to the systems described above, a number of other systems based on Rough Sets methodology have been built. The best known are RoughDAS (Slowinski and Stefanowski, 1992), RSL (Gawrys and Sienkiewicz, 1994) and GRG [SHC] (Shan, Hamilton and Cercone, 1995).



Helge Grenager Solheim
Sat May 4 03:30:02 MET DST 1996