A deductive system reasons about data by using a pre-defined set
of rules. These rules limit how the information given to the deductive
system may be used to draw conclusions and infer information.
A correct deductive system is inferring information which is a logical
consequence of the database contents.
The only kind of information one will receive with deduction is what explicitly has been inserted into the database in one or more tables. Is this really learning? By arguing that implicit and hidden information will be found and presented to the user explicitly, the answer to this is yes. It may for instance be inferred that John is Mary's husband by joining a table containing persons with a table containing wedding information.
If we want more general information, or information of a higher order,
we could use another learning method called induction. Induction could
be said to be the creation of a simplified model of the information in the
environment subject to study. This environment corresponds to the training
set, and is searched for regularities. In the learning phase, the system
observes the environment, and creates a model of what is being observed. This
simplified model is used to create general rules. The rules may be represented
in several different forms, as will be described Section
.
An example of a general rule an inductive system may generate, is that all employees in a firm get paid each month. This contrasts the rules achieved from deductive learning where only specific rules following logically from the database contents would be created. (We could get the result that John is an employee and gets paid and that Mary is an employee and gets paid and so on, but nothing concerning persons not in the database.) When faced with a person in the test set which is not in the training set, by using the rules from induction one could draw the conclusion that he gets paid every month, but the deductive rules would not be applicable.
For the purpose of data mining, inductive learning is the most appropriate, and indeed the most used. Reasoning in such a manner will generate knowledge which hopefully is valid not only for the objects in the database, but also for similar, but yet unseen objects. By using supervised or unsupervised learning, the system is told what kind of patterns to focus on. This is discussed next.