The data mining process usually starts with the collection and preprocessing of information, which is then stored in some kind of database. A more advanced approach is to store the processed data using knowledge representations that logically describe the content of the database. Typical data sources are transaction files or databases of various sorts.
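The collection, preprocessing and storage steps can be illustrated with a minimal sketch. The record layout, the cleaning rules and the table name `purchases` are assumptions made for illustration only; they are not taken from any particular data mining system.

```python
import sqlite3

# Hypothetical raw transaction records: (transaction id, item, quantity as text).
RAW_RECORDS = [
    ("t1", " Bread ", "2"),
    ("t1", "butter", "1"),
    ("t2", "BREAD", "1"),
    ("t2", "milk", ""),        # missing quantity
    ("t3", "butter", "3"),
]


def preprocess(records):
    """Normalise item names and drop records with a missing quantity."""
    cleaned = []
    for tid, item, qty in records:
        if not qty.strip():
            continue  # assumed rule: incomplete records are discarded
        cleaned.append((tid, item.strip().lower(), int(qty)))
    return cleaned


def store(rows):
    """Load the cleaned rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (tid TEXT, item TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = store(preprocess(RAW_RECORDS))
item_totals = conn.execute(
    "SELECT item, SUM(qty) FROM purchases GROUP BY item ORDER BY item"
).fetchall()
print(item_totals)
```

An in-memory database keeps the sketch self-contained; in practice the cleaned data would be loaded into whatever database the organisation already uses.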
Data mining tools search for patterns in the data. The search may be done automatically by the system or interactively, with an analyst making queries (a top-down search for testing hypotheses). A wide range of data mining tools, such as neural networks, rule-based systems, case-based analysis, machine learning and statistical applications, may be applied to a problem, alone or in combination. Applications based on Rough Sets fall into the categories of rule-based systems, case-based analysis and machine learning.
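The difference between the two search modes can be sketched in a few lines. The example below uses a simple co-occurrence count as a stand-in for an automatic pattern search and a confidence calculation as a stand-in for an analyst's top-down query; it is not a Rough Sets method or any of the tools named above, and the transactions and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, each a set of purchased items.
TRANSACTIONS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
    {"bread", "butter", "jam"},
]


def frequent_pairs(transactions, min_support):
    """Automatic search: item pairs found in at least min_support of all transactions."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(t), 2))
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def confidence(transactions, antecedent, consequent):
    """Top-down query: share of transactions with `antecedent` that also contain `consequent`."""
    having = [t for t in transactions if antecedent in t]
    if not having:
        return 0.0
    return sum(consequent in t for t in having) / len(having)


# The system proposes patterns without being told what to look for.
patterns = frequent_pairs(TRANSACTIONS, min_support=0.6)

# The analyst tests a specific hypothesis: "bread buyers also buy butter".
bread_butter = confidence(TRANSACTIONS, "bread", "butter")
print(patterns, bread_butter)
```

In the automatic mode the system enumerates candidate patterns itself, whereas in the top-down mode the analyst supplies the hypothesis and the system only measures how well the data support it.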
The search process is often interactive, in the sense that the
analyst makes queries, reviews the results, and based on these makes new
queries. Once this process is completed, the data mining system
generates the findings. It is then a human task (at present, normally)
to take action based on these results. The figure below shows the steps
in the process from raw data to new, assimilated information.
Figure: Different steps in the data mining process.