Professional Documents
Culture Documents
Course Material
Introduction
Evolution of Database Technology
Data Mining- Motivation
Data Mining?
Knowledge Discovery (KDD) Process
7 Data Mining Steps
Data Mining and Business Intelligence
Architecture of Data Mining system
DM Functionalities - A Multi-Dimensional View of Data Mining
Kinds of Data to be Mined
Kinds of Patterns to be Mined
Technologies
Applications Targeted
Evaluation of Knowledge
Major Issues in Data Mining
Evolution of Database Technology
1960s:
o Data collection, database creation, IMS and network DBMS
1970s:
o Relational data model, relational DBMS implementation
1980s:
o RDBMS, advanced data models (extended-relational, OO, deductive,
etc.)
o Application-oriented DBMS (spatial, scientific, engineering, etc.)
1990s:
UnitI DWDM_SMA
7 Data Mining Steps
Data cleaning remove noise and inconsistent data
Data integration combine multiple sources
Data selection retrieve from the database data relevant to the analysis task
Data transformation data are transformed or consolidated into forms
appropriate for mining (e.g. performing summary or aggregation operations)
Data mining intelligent methods are applied to extract data patterns
Pattern evaluation identify truly interesting patterns representing
knowledge based on some interestingness measures
Knowledge presentation present mined knowledge to the user
Data Mining and Business Intelligence
UnitI DWDM_SMA
UnitI DWDM_SMA
Data Mining: Classification Schemes
General functionality
o Descriptive data mining
o Predictive data mining
Different views lead to different classifications
o Data view: Kinds of data to be mined
o Knowledge view: Kinds of knowledge to be discovered
o Method view: Kinds of techniques utilized
o Application view: Kinds of applications adapted
Data Mining: On What Kinds of Data?
Database-oriented data sets and applications
o Relational database,
o Data warehouse,
o Transactional database
o Object-relational databases,
o Heterogeneous databases and
o Legacy databases
Advanced data sets and advanced applications
o Data streams and sensor data
o Time-series data, temporal data, sequence data (incl. bio-sequences)
o Structure data, graphs, social networks and information networks
o Spatial data and spatiotemporal data
o Multimedia database
o Text databases
o The World-Wide Web
Data Mining Functionalities
Multidimensional concept description: Characterization and discrimination
o Generalize, summarize, and contrast data characteristics, e.g., dry vs.
wet regions
Frequent patterns, association, correlation vs. causality
UnitI DWDM_SMA
10
11