
DBMS-2 B.Com (CA, 4th sem)
Unit-2 (Notes)

Decision support system (DSS)


A Decision Support System (DSS) is a computer-based information system that
supports business or organizational decision-making activities. DSSs serve the
management, operations, and planning levels of an organization (usually mid and
higher management) and help to make decisions, which may be rapidly changing and
not easily specified in advance (Unstructured and Semi-Structured decision problems).
Decision support systems can be fully computerized, human-powered, or a combination of
both.
While academics have perceived DSS as a tool to support the decision-making process,
DSS users see DSS as a tool to facilitate organizational processes. [1] Some authors
have extended the definition of DSS to include any system that might support decision
making.[2] Sprague (1980) defines DSS by its characteristics:
1. DSS tends to be aimed at the less well structured, underspecified problems that
upper level managers typically face;
2. DSS attempts to combine the use of models or analytic techniques with
traditional data access and retrieval functions;
3. DSS specifically focuses on features which make it easy to use by
non-computer people in an interactive mode; and
4. DSS emphasizes flexibility and adaptability to accommodate changes in
the environment and the decision making approach of the user.

Characteristics of DSS
The following is my list of the characteristics of a DSS.

1. Facilitation. DSS facilitate and support specific decision-making activities
and/or decision processes.
2. Interaction. DSS are computer-based systems designed for interactive use by
decision makers or staff users who control the sequence of interaction and the
operations performed.
3. Ancillary. DSS can support decision makers at any level in an organization. They
are NOT intended to replace decision makers.
4. Repeated Use. DSS are intended for repeated use. A specific DSS may be used
routinely or used as needed for ad hoc decision support tasks.
5. Task-oriented. DSS provide specific capabilities that support one or more tasks
related to decision-making, including: intelligence and data analysis; identification
and design of alternatives; choice among alternatives; and decision implementation.
6. Identifiable. DSS may be independent systems that collect or replicate data
from other information systems OR subsystems of a larger, more integrated
information system.
7. Decision Impact. DSS are intended to improve the accuracy, timeliness, quality
and overall effectiveness of a specific decision or a set of related decisions.

Benefits of DSS
(1) Time savings. For all categories of decision support systems,
research has demonstrated and substantiated reduced decision cycle
time, increased employee productivity and more timely information for
decision making. The time savings that have been documented from
using computerized decision support are often substantial.
Researchers, however, have not always demonstrated that decision
quality remained the same or actually improved.
(2) Enhance effectiveness. A second category of advantage that has
been widely discussed and examined is improved decision making
effectiveness and better decisions. Decision quality and decision
making effectiveness are, however, hard to document and measure.
Most research has examined soft measures like perceived decision
quality rather than objective measures. Advocates of building data
warehouses identify the possibility of more and better analysis that
can improve decision making.
(3) Improve interpersonal communication. DSS can improve
communication and collaboration among decision makers. In
appropriate circumstances, communications-driven and group DSS
have had this impact. Model-driven DSS provide a means for sharing
facts and assumptions. Data-driven DSS make "one version of the
truth" about company operations available to managers and hence can
encourage fact-based decision making. Improved data accessibility is
often a major motivation for building a data-driven DSS. This
advantage has not been adequately demonstrated for most types of
DSS.
(4) Competitive advantage. Vendors frequently cite this advantage
for business intelligence systems, performance management systems,
and web-based DSS. Although it is possible to gain a competitive
advantage from computerized decision support, this is not a likely
outcome. Vendors routinely sell the same product to competitors and
even help with the installation. Organizations are most likely to gain
this advantage from novel, high risk, enterprise-wide, inward facing
decision support systems. Measuring this is and will continue to be
difficult.
(5) Cost reduction. Some research, and especially case studies,
have documented DSS cost saving from labor savings in making
decisions and from lower infrastructure or technology costs. This is not
always a goal of building DSS.
(6) Increase decision maker satisfaction. The novelty of using
computers has confounded, and may continue to confound, analysis of this
outcome. DSS may reduce frustrations of decision makers, create
perceptions that better information is being used and/or create
perceptions that the individual is a "better" decision maker.
Satisfaction is a complex measure and researchers often measure
satisfaction with the DSS rather than satisfaction with using a DSS in
decision making. Some studies have compared satisfaction with and
without computerized decision aids. Those studies suggest the
complexity and "love/hate" tension of using computers for decision
support.

(7) Promote learning. Learning can occur as a by-product of initial
and ongoing use of a DSS. Two types of learning seem to occur:
learning of new concepts and the development of a better factual
understanding of the business and decision making environment.
Some DSS serve as "de facto" training tools for new employees. This
potential advantage has not been adequately examined.
(8) Increase organizational control. Data-driven DSS often make
business transaction data available for performance monitoring and ad
hoc querying. Such systems can enhance management understanding
of business operations and managers perceive that this is useful. What
is not always evident is the financial benefit from increasingly detailed
data.

Components of DSS
The main components of a DSS are:
1. Hardware
2. Software
1. Hardware: - Hardware comprises the parts of the computer system that can be touched;
these are the tangible parts. Without hardware, software is nothing. Hardware is like
the human body and software is like the soul in the body. All input and output devices are
hardware. For example, the mouse and the keyboard are hardware.
There is no fixed hardware configuration for designing, developing, maintaining and
executing a DSS. The hardware configuration for a DSS is mainly determined by:
a) The size of the database.
b) The DBMS package which one intends to use.
c) The type of models that are being used.
d) The ways in which reports/presentations are expected.
2. Software: - Software is a set of computer programs that are designed and developed to
perform a specific task. Software acts as an interface between the user and the computer.
Software can be defined as a set of instructions written by a programmer to solve a
problem. It can be classified as:
a) Database Management Sub-system
b) Model Management Sub-system
c) Dialogue Management Sub-system
These are explained below:
a) Database Management Sub-system:- Normally there are two sources of data:
internal sources and external sources. The database management system provides
facilities for organizing, storing and querying these data. It acts as an information
bank. DBMS software provides facilities to create the database, to manipulate (insert,
modify and delete) the data present in it, and to query that data.
The architecture of a database management system includes the External Schema,
Conceptual Schema, and Internal Schema.
b) Model Management Sub-system:- A model presents the relationship between
various parameters of the system. It gives a mathematical description of reality. The
model builder provides a structured framework that helps decision makers develop
models. The model builder also contains a model dictionary, which keeps the
definitions and uses of models consistent.
A model management subsystem provides the following:
1. A model base management system, which helps in the creation and
maintenance of models.
2. An external interface which permits a user to choose a model to be executed
and provides facilities for entering data.
3. An interface to the database.
c) Dialogue Management Sub-system:- This acts as the gateway for the user to
communicate with the DSS. It provides menus and icons for the user to communicate
effectively with the system. It converts the queries given by the user into forms which
the other subsystems can recognize and execute. It keeps track of the activities that are
being performed.
The major activities of a Dialogue management subsystem are to:
1. Provide menus and icons for the user to communicate effectively with the
system.
2. Provide necessary on-line, context-sensitive help to various kinds of users.
3. Convert the queries given by the user into forms which the other subsystems
can recognize and execute.
4. Keep track of the activities being performed.
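The way the three software subsystems fit together can be pictured with a small Python sketch. It is illustrative only: the class names, the sales table and the two models ("average" and a naive "forecast") are assumptions made for this example and do not come from any particular DSS product.

```python
import sqlite3


class DatabaseSubsystem:
    """Database management subsystem: stores and queries the data."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE sales (month INTEGER, amount REAL)")

    def load(self, rows):
        self.conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    def amounts(self):
        cursor = self.conn.execute("SELECT amount FROM sales ORDER BY month")
        return [row[0] for row in cursor]


class ModelSubsystem:
    """Model management subsystem: keeps a model base and runs a chosen model."""

    def __init__(self, database):
        self.database = database  # the interface to the database
        self.model_base = {"average": self._average, "forecast": self._forecast}

    def _average(self):
        data = self.database.amounts()
        return sum(data) / len(data)

    def _forecast(self):
        # Naive forecast (an assumption): last value plus the mean monthly change.
        data = self.database.amounts()
        mean_change = (data[-1] - data[0]) / (len(data) - 1)
        return data[-1] + mean_change

    def run(self, name):
        return self.model_base[name]()


class DialogueSubsystem:
    """Dialogue management subsystem: the user's gateway to the DSS."""

    def __init__(self, model_subsystem):
        self.model_subsystem = model_subsystem
        self.history = []  # keeps track of the activities performed

    def menu(self):
        return sorted(self.model_subsystem.model_base)

    def ask(self, choice):
        # Convert the user's menu choice into a call the model subsystem can execute.
        self.history.append(choice)
        return self.model_subsystem.run(choice)


database = DatabaseSubsystem()
database.load([(1, 100.0), (2, 120.0), (3, 140.0)])
dss = DialogueSubsystem(ModelSubsystem(database))
print(dss.menu())
print(dss.ask("average"))
print(dss.ask("forecast"))
```

The user only ever talks to the dialogue subsystem; it passes the request to the model subsystem, which in turn reads its data through the database subsystem.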

Operational Data vs DSS Data


Decision support systems and operational database systems are similar because
both use stored data, but the data is organized differently for the two types (Power,
2007, para. 1). Power (2007) also states that DSS data is data about business
occurrences and is often a summarization of transactions. Operational data is a
detailed record of a company's daily business transactions.
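The difference can be illustrated with a few lines of Python. The transaction records and field names below are made up for the example: the operational data is the detailed list of transactions, and the DSS data is a summary derived from it.

```python
from collections import defaultdict

# Operational data: one detailed record per business transaction (hypothetical values).
transactions = [
    {"date": "2024-01-05", "product": "pen", "amount": 10.0},
    {"date": "2024-01-17", "product": "book", "amount": 40.0},
    {"date": "2024-02-02", "product": "pen", "amount": 15.0},
    {"date": "2024-02-20", "product": "pen", "amount": 5.0},
]


def summarize_by_month(records):
    """Build DSS-style data: total sales per month."""
    totals = defaultdict(float)
    for record in records:
        month = record["date"][:7]  # keep only "YYYY-MM"
        totals[month] += record["amount"]
    return dict(totals)


monthly_sales = summarize_by_month(transactions)
print(monthly_sales)
```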
The benefits of DSS and ODS are:
DSS
1. Improves personal efficiency
2. Speeds up the process of decision making
3. Increases organizational control
4. Encourages exploration and discovery on the part of the decision maker
5. Speeds up problem solving in an organization
6. Facilitates interpersonal communication
7. Promotes learning or training
8. Generates new evidence in support of a decision
9. Creates a competitive advantage over competition
10. Reveals new approaches to thinking about the problem space
11. Helps automate managerial processes
ODS
1. Provides quick retrieval
2. Offers the ability to share information across the company
3. Stores a large amount of data that pertains to a business
4. Supports simultaneous read/write requests through pre-defined queries
5. Has the ability to flag specific information that may need to be retrieved on a
continuous basis

Data Warehouse

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile
collection of data in support of management's decision making process.
Subject-Oriented: A data warehouse can be used to analyze a particular subject
area. For example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources. For
example, source A and source B may have different ways of identifying a product,
but in a data warehouse, there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can
retrieve data that is 3 months, 6 months or 12 months old, or even older, from a data
warehouse. This contrasts with a transaction system, where often only the most
recent data is kept. For example, a transaction system may hold the most recent
address of a customer, whereas a data warehouse can hold all addresses associated
with a customer.
Non-volatile: Once data is in the data warehouse, it will not change. So, historical
data in a data warehouse should never be altered.
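The integrated, time-variant and non-volatile properties can be sketched in Python using the customer-address example. This is a minimal illustration under assumed names (`AddressRecord`, `CustomerAddressHistory`, `PRODUCT_KEYS`), not a real warehouse design: rows are only ever appended, each row carries the date from which it is valid, and two source-specific product codes map to one warehouse key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a stored row cannot be altered (non-volatile)
class AddressRecord:
    customer_id: str
    address: str
    valid_from: str  # ISO date, so string order matches date order


class CustomerAddressHistory:
    """Append-only store: rows are added, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, customer_id, address, valid_from):
        self._rows.append(AddressRecord(customer_id, address, valid_from))

    def all_addresses(self, customer_id):
        rows = [r for r in self._rows if r.customer_id == customer_id]
        return [r.address for r in sorted(rows, key=lambda r: r.valid_from)]

    def address_on(self, customer_id, date):
        # Time-variant: answer "what was the address on this date?"
        rows = [r for r in self._rows
                if r.customer_id == customer_id and r.valid_from <= date]
        return max(rows, key=lambda r: r.valid_from).address if rows else None


# Integrated: two source systems identify the same product differently,
# but the warehouse uses a single key.
PRODUCT_KEYS = {("source_a", "P-001"): "PEN", ("source_b", "17"): "PEN"}

history = CustomerAddressHistory()
history.load("C1", "12 Old Road", "2022-03-01")
history.load("C1", "7 New Street", "2023-09-15")

print(history.all_addresses("C1"))
print(history.address_on("C1", "2023-01-01"))
```

A transaction system would typically overwrite the old address; here the earlier row stays in place and the new one is added beside it.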

Data Mining
Data Mining is an analytic process designed to explore data (usually large amounts of data, typically business or market related, also known as "big data") in search of consistent
patterns and/or systematic relationships between variables, and then to validate the findings
by applying the detected patterns to new subsets of data. The ultimate goal of data mining is
prediction - and predictive data mining is the most common type of data mining and one
that has the most direct business applications. The process of data mining consists of three
stages: (1) the initial exploration, (2) model building or pattern identification
with validation/verification, and (3) deployment (i.e., the application of the model to new
data in order to generate predictions).
Stage 1: Exploration. This stage usually starts with data preparation which may involve
cleaning data, data transformations, selecting subsets of records and - in case of data sets
with large numbers of variables ("fields") - performing some preliminary feature
selection operations to bring the number of variables to a manageable range (depending on
the statistical methods which are being considered). Then, depending on the nature of the
analytic problem, this first stage of the process of data mining may involve anything
from a simple choice of straightforward predictors for a regression model to elaborate
exploratory analyses using a wide variety of graphical and statistical methods
(see Exploratory Data Analysis (EDA)) in order to identify the most relevant variables and
determine the complexity and/or the general nature of models that can be taken into
account in the next stage.
Stage 2: Model building and validation. This stage involves considering various models and
choosing the best one based on their predictive performance (i.e., explaining the variability
in question and producing stable results across samples). This may sound like a simple
operation, but in fact, it sometimes involves a very elaborate process. There are a variety of
techniques developed to achieve that goal - many of which are based on so-called
"competitive evaluation of models," that is, applying different models to the same data set
and then comparing their performance to choose the best. These techniques - which are often
considered the core of predictive data mining - include: Bagging (Voting,
Averaging), Boosting, Stacking (Stacked Generalizations), and Meta-Learning.
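As an illustration of one of these techniques, the sketch below shows bagging with averaging for a numeric prediction (voting would be the counterpart for classification). It is a simplified example under stated assumptions: the base model is an ordinary least-squares line, the data set is synthetic, and the function names are invented for this sketch.

```python
import random
from statistics import mean


def fit_line(points):
    """Least-squares line through (x, y) points; returns (slope, intercept)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    spread = sum((x - x_bar) ** 2 for x in xs)
    if spread == 0:  # every x in this resample is equal: fall back to a flat line
        return 0.0, y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / spread
    return slope, y_bar - slope * x_bar


def bagged_models(points, n_models, rng):
    """Fit one model per bootstrap sample (drawn with replacement)."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]
        models.append(fit_line(sample))
    return models


def bagged_predict(models, x):
    """Averaging: the ensemble prediction is the mean of the members' predictions."""
    return mean(slope * x + intercept for slope, intercept in models)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Synthetic data (an assumption): y is roughly 2x + 1 with a little noise.
data = [(x, 2.0 * x + 1.0 + rng.uniform(-0.5, 0.5)) for x in range(20)]
models = bagged_models(data, n_models=25, rng=rng)
print(bagged_predict(models, 10))
```

Each member model sees a slightly different resample of the data, so averaging their predictions tends to be more stable than relying on a single fitted model.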
Stage 3: Deployment. This final stage involves using the model selected as best in the
previous stage and applying it to new data in order to generate predictions or estimates of
the expected outcome.
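The three stages can be walked through on a toy problem. The sketch below is only an illustration: the data values are hypothetical, the two candidate models (a constant average and a straight line) are chosen for simplicity, and a single holdout split stands in for the more elaborate validation schemes used in practice.

```python
from statistics import mean


def clean(records):
    """Stage 1 (exploration): drop records with missing values."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_mean(train):
    """Candidate model A: always predict the average outcome."""
    average = mean(y for _, y in train)
    return lambda x: average


def fit_line(train):
    """Candidate model B: least-squares straight line."""
    x_bar = mean(x for x, _ in train)
    y_bar = mean(y for _, y in train)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in train)
             / sum((x - x_bar) ** 2 for x, _ in train))
    return lambda x: y_bar + slope * (x - x_bar)


def mean_squared_error(model, records):
    return mean((model(x) - y) ** 2 for x, y in records)


# Hypothetical data: advertising spend (x) and sales (y), with one bad record.
raw = [(1, 3.1), (2, 4.9), (3, 7.2), (None, 5.0), (4, 9.0),
       (5, 10.8), (6, 13.1), (7, 15.0), (8, 16.9)]

data = clean(raw)
train, validation = data[::2], data[1::2]  # simple holdout split

# Stage 2: fit every candidate on the same training data, compare on held-out data.
candidates = {"mean": fit_mean(train), "line": fit_line(train)}
scores = {name: mean_squared_error(model, validation)
          for name, model in candidates.items()}
best_name = min(scores, key=scores.get)

# Stage 3 (deployment): apply the selected model to new data.
best_model = candidates[best_name]
predictions = [best_model(x) for x in (9, 10)]
print(best_name, predictions)
```

Comparing the candidates on records they were not fitted to is a small-scale version of the "competitive evaluation of models" described in Stage 2.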
The concept of Data Mining is becoming increasingly popular as a business information
management tool where it is expected to reveal knowledge structures that can guide
decisions in conditions of limited certainty. Recently, there has been increased interest in
developing new analytic techniques specifically designed to address the issues relevant to
business Data Mining (e.g., Classification Trees), but Data Mining is still based on the
conceptual principles of statistics including the traditional Exploratory Data Analysis
(EDA) and modeling and it shares with them both some components of its general approaches
and specific techniques.

Components Of Data Warehouse

The data warehouse architecture is based on a relational database
management system server that functions as the central repository for
informational data. Operational data and processing is completely separated
from data warehouse processing. This central information repository is
surrounded by a number of key components designed to make the entire
environment functional, manageable and accessible by both the operational
systems that source data into the warehouse and by end-user query and
analysis tools.
Typically, the source data for the warehouse is coming from the operational
applications. As the data enters the warehouse, it is cleaned up and
transformed into an integrated structure and format. The transformation
process may involve conversion, summarization, filtering and condensation
of data. Because the data contains a historical component, the warehouse
must be capable of holding and managing large volumes of data as well as
different data structures for the same database over time.
The next sections look at the major components of data warehousing:
Data Warehouse Database
The central data warehouse database is the cornerstone of the data
warehousing environment. This database is almost always implemented on
the relational database management system (RDBMS) technology. However,
this kind of implementation is often constrained by the fact that traditional
RDBMS products are optimized for transactional database processing.
Certain data warehouse attributes, such as very large database size, ad hoc
query processing and the need for flexible user view creation including
aggregates, multi-table joins and drill-downs, have become drivers for
different technological approaches to the data warehouse database. These
approaches include:

Parallel relational database designs for scalability that include shared-memory,
shared-disk, or shared-nothing models implemented on
various multiprocessor configurations (symmetric multiprocessors or
SMP, massively parallel processors or MPP, and/or clusters of uni- or
multiprocessors).

An innovative approach to speed up a traditional RDBMS by using new
index structures to bypass relational table scans.

Sourcing, Acquisition, Cleanup and Transformation Tools

A significant portion of the implementation effort is spent extracting data
from operational systems and putting it in a format suitable for informational
applications that run off the data warehouse.
The data sourcing, cleanup, transformation and migration tools perform all of
the conversions, summarizations, key changes, structural changes and
condensations needed to transform disparate data into information that can
be used by the decision support tool. They produce the programs and control
statements, including the COBOL programs, MVS job-control language (JCL),
UNIX scripts, and SQL data definition language (DDL) needed to move data
into the data warehouse from multiple operational systems. These tools also
maintain the meta data. The functionality includes:

Removing unwanted data from operational databases

Converting to common data names and definitions

Establishing defaults for missing data

Accommodating source data definition changes
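
The functions listed above can be sketched as a single transformation step in Python. The field names, the name mapping and the default values are assumptions invented for this example; real tools generate far more elaborate programs, but the idea is the same.

```python
# Hypothetical mapping from source-specific field names to common warehouse names.
COMMON_NAMES = {"cust_no": "customer_id", "CustomerID": "customer_id",
                "amt": "amount", "Amount": "amount", "rgn": "region"}
WANTED = {"customer_id", "amount", "region"}  # everything else is dropped
DEFAULTS = {"region": "UNKNOWN"}              # defaults for missing data


def transform(record):
    """Turn one operational record into the warehouse format."""
    # Convert to common data names. Unknown names pass through unchanged,
    # so a field added to a source definition does not break the load.
    renamed = {COMMON_NAMES.get(key, key): value for key, value in record.items()}
    # Remove unwanted data.
    kept = {key: value for key, value in renamed.items() if key in WANTED}
    # Establish defaults for missing data.
    for key, default in DEFAULTS.items():
        if kept.get(key) is None:
            kept[key] = default
    return kept


source_a = {"cust_no": "C1", "amt": 25.0, "rgn": "North", "terminal": "T9"}
source_b = {"CustomerID": "C2", "Amount": 40.0}

warehouse_rows = [transform(source_a), transform(source_b)]
print(warehouse_rows)
```

Both source records end up with the same three field names, even though they arrived with different names and one of them had no region.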

Meta data
Meta data is data about data that describes the data warehouse. It is used
for building, maintaining, managing and using the data warehouse. Meta
data can be classified into:

Technical meta data, which contains information about warehouse data
for use by warehouse designers and administrators when carrying out
warehouse development and management tasks.

Business meta data, which contains information that gives users an
easy-to-understand perspective of the information stored in the data
warehouse.

Equally important, meta data provides interactive access to users to help
understand content and find data. One issue with meta data is that the
capabilities of many data extraction tools to gather meta
data remain fairly immature. Therefore, there is often the need to create a
meta data interface for users, which may involve some duplication of effort.
Access Tools

The principal purpose of data warehousing is to provide information to
business users for strategic decision-making. These users interact with the
data warehouse using front-end tools. Many of these tools require an
information specialist, although many end users develop expertise in the
tools. Tools fall into four main categories: query and reporting tools,
application development tools, online analytical processing tools, and data
mining tools.
Query and Reporting tools can be divided into two groups: reporting tools
and managed query tools. Reporting tools can be further divided into
production reporting tools and report writers. Production reporting tools let
companies generate regular operational reports or support high-volume
batch jobs such as calculating and printing paychecks. Report writers, on the
other hand, are inexpensive desktop tools designed for end-users.
Managed query tools shield end users from the complexities of SQL and
database structures by inserting a metalayer between users and the
database. These tools are designed for easy-to-use, point-and-click
operations that either accept SQL or generate SQL database queries.
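The idea of a metalayer can be shown with a short Python sketch that uses the standard sqlite3 module. Everything named here is an assumption for illustration: the business terms, the cryptic physical names (`f_sales`, `rgn_cd`, `sale_amt`) and the `build_query` helper. The user picks business terms from a menu, and the tool generates the SQL.

```python
import sqlite3

# Metalayer: business terms on the left, physical column expressions on the right.
# The table and column names are made up for this sketch.
METALAYER = {
    "Region": "rgn_cd",
    "Product": "prod_nm",
    "Total Sales": "SUM(sale_amt)",
}
TABLE = "f_sales"


def build_query(dimension, measure):
    """Generate SQL from two menu selections; the user never types SQL."""
    # Only names taken from the metalayer are placed in the SQL text.
    column = METALAYER[dimension]
    aggregate = METALAYER[measure]
    return (f"SELECT {column}, {aggregate} FROM {TABLE} "
            f"GROUP BY {column} ORDER BY {column}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE f_sales (rgn_cd TEXT, prod_nm TEXT, sale_amt REAL)")
conn.executemany("INSERT INTO f_sales VALUES (?, ?, ?)", [
    ("North", "pen", 10.0), ("North", "book", 40.0), ("South", "pen", 15.0),
])

sql = build_query("Region", "Total Sales")
result = conn.execute(sql).fetchall()
print(sql)
print(result)
```

The end user sees only "Region" and "Total Sales"; the database structure behind those terms can change without the user having to learn new names.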
Information Delivery System
The information delivery component is used to enable the process of
subscribing for data warehouse information and having it delivered to one or
more destinations according to some user-specified scheduling algorithm. In
other words, the information delivery system distributes warehouse-stored
data and other information objects to other data warehouses and end-user
products such as spreadsheets and local databases. Delivery of information
may be based on time of day or on the completion of an external event. The
rationale for the delivery systems component is based on the fact that once
the data warehouse is installed and operational, its users don't have to be
aware of its location and maintenance. All they need is the report or an
analytical view of data at a specific point in time. With the proliferation of
the Internet and the World Wide Web such a delivery system may leverage
the convenience of the Internet by delivering warehouse-enabled information
to thousands of end-users via the ubiquitous world wide network.

Benefits Of Data Warehouse


A Data Warehouse Saves Time
Since business users can quickly access critical data from a number of sources, all in
one place, they can rapidly make informed decisions on key initiatives. They won't
waste precious time retrieving data from multiple sources.
Not only that, but the business execs can query the data themselves with little or no
support from IT, saving more time and more money. That means the business users
won't have to wait until IT gets around to generating the reports, and those hardworking
folks in IT can do what they do best: keep the business running.
A Data Warehouse Enhances Data Quality and Consistency
A data warehouse implementation includes the conversion of data from numerous
source systems into a common format. Since the data from the various departments
is standardized, each department will produce results that are in line with all the other
departments. So you can have more confidence in the accuracy of your data. And
accurate data is the basis for strong business decisions.
A Data Warehouse Provides Historical Intelligence
A data warehouse stores large amounts of historical data so you can analyze different
time periods and trends in order to make future predictions. Such data typically cannot
be stored in a transactional database or used to generate reports from a transactional
system.
A Data Warehouse Delivers Enhanced Business Intelligence
Because a data warehouse provides data from various sources, managers and executives
will no longer need to make business decisions based on limited data or their gut. In addition, data
warehouses and related BI can be applied directly to business processes including
marketing segmentation, inventory management, financial management, and sales.

Limitations Of Data Warehouse


Extra Reporting Work
Depending on the size of the organization, a data warehouse runs the risk of creating extra work
for departments. Each type of data that's needed in the warehouse typically has to be
generated by the IT teams in each division of the business. This can be as simple as
duplicating data from an existing database, but at other times, it involves gathering data
from customers or employees that wasn't gathered before.
Cost/Benefit Ratio
A commonly cited disadvantage of data warehousing is its cost/benefit ratio. A data
warehouse is a big IT project, and like many big IT projects, it can consume a lot of IT man
hours and budgetary money to generate a tool that doesn't get used often enough to
justify the implementation expense. That is before counting the
expense of maintaining the data warehouse and updating it as the business grows and
adapts to the market.
Data Ownership Concerns
Data warehouses are often, but not always, Software as a Service implementations, or
cloud services applications. Your data security in this environment is only as good as
your cloud vendor. Even if implemented locally, there are concerns about data access
throughout the company. Make sure that the people doing the analysis are individuals
that your organization trusts, especially with customers' personal data. A data
warehouse that leaks customer data is a privacy and public relations nightmare.
Data Flexibility
Data warehouses tend to have static data sets with minimal ability to "drill down" to
specific solutions. The data is imported and filtered through a schema, and it is often
days or weeks old by the time it's actually used. In addition, data warehouses are
usually subject to ad hoc queries and are thus notoriously difficult to tune for processing
speed and query speed. While the queries are often ad hoc, the queries are limited by
what data relations were set when the aggregation was assembled.

Data Mining Applications


Here is the list of areas where data mining is widely used:

Financial Data Analysis

Retail Industry

Telecommunication Industry

Biological Data Analysis

Other Scientific Applications

Intrusion Detection

FINANCIAL DATA ANALYSIS


The financial data in the banking and financial industry is generally reliable and of high
quality, which facilitates systematic data analysis and data mining. Here are a few
typical cases:

Design and construction of data warehouses for multidimensional data analysis
and data mining.

Loan payment prediction and customer credit policy analysis.

Classification and clustering of customers for targeted marketing.

Detection of money laundering and other financial crimes.

RETAIL INDUSTRY
Data mining has great application in the retail industry because the industry collects large
amounts of data on sales, customer purchasing history, goods transportation, consumption
and services. It is natural that the quantity of data collected will continue to expand
rapidly because of the increasing ease, availability and popularity of the web.
Data mining in the retail industry helps in identifying customer buying patterns and
trends. That leads to improved quality of customer service and good customer retention
and satisfaction. Here is a list of examples of data mining in the retail industry:

Design and Construction of data warehouses based on benefits of data mining.

Multidimensional analysis of sales, customers, products, time and region.

Analysis of effectiveness of sales campaigns.

Customer Retention.

Product recommendation and cross-referencing of items.

TELECOMMUNICATION INDUSTRY
Today the telecommunication industry is one of the fastest-emerging industries, providing
various services such as fax, pager, cellular phone, Internet messenger, images, e-mail,
web data transmission etc. Due to the development of new computer and communication
technologies, the telecommunication industry is rapidly expanding. This is the reason
why data mining has become very important in helping to understand the business.
Data mining in the telecommunication industry helps in identifying telecommunication
patterns, catching fraudulent activities, making better use of resources, and improving quality of
service. Here is a list of examples in which data mining improves telecommunication
services:

Multidimensional Analysis of Telecommunication data.

Fraudulent pattern analysis.

Identification of unusual patterns.

Multidimensional association and sequential patterns analysis.

Mobile Telecommunication services.

Use of visualization tools in telecommunication data analysis.

BIOLOGICAL DATA ANALYSIS


Nowadays we see vast growth in fields of biology such as genomics,
proteomics, functional genomics and biomedical research. Biological data mining is a very
important part of bioinformatics. Following are the aspects in which data mining
contributes to biological data analysis:

Semantic integration of heterogeneous, distributed genomic and proteomic
databases.

Alignment, indexing, similarity search and comparative analysis of multiple
nucleotide sequences.

Discovery of structural patterns and analysis of genetic networks and protein
pathways.

Association and path analysis.

Visualization tools in genetic data analysis.

OTHER SCIENTIFIC APPLICATIONS


The applications discussed above tend to handle relatively small and homogeneous
data sets for which statistical techniques are appropriate. Huge amounts of data have
been collected from scientific domains such as geosciences, astronomy etc. Large
data sets are also being generated by fast numerical simulations in
various fields such as climate and ecosystem modeling, chemical engineering, fluid
dynamics etc. Following are the applications of data mining in the field of scientific
applications:

Data Warehouses and data preprocessing.

Graph-based mining.

Visualization and domain specific knowledge.

INTRUSION DETECTION
Intrusion refers to any kind of action that threatens the integrity, confidentiality, or availability
of network resources. In this world of connectivity, security has become a major issue.
Increased usage of the internet and the availability of tools and tricks for intruding on and
attacking networks have prompted intrusion detection to become a critical component of
network administration. Here is the list of areas in which data mining technology may be
applied for intrusion detection:

Development of data mining algorithms for intrusion detection.

Association and correlation analysis, aggregation to help select and build
discriminating attributes.

Analysis of Stream data.

Distributed data mining.

Visualization and query tools.
