-2- 07.10.2004
Copyright
No part of this publication may be reproduced or transmitted in any form or for any purpose without the
express permission of SAP AG. The information contained herein may be changed without prior
notice.
Some software products marketed by SAP AG and its distributors contain proprietary software
components of other software vendors.
Microsoft, WINDOWS, NT, EXCEL, Word, PowerPoint and SQL Server are registered
trademarks of Microsoft Corporation.
IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390,
AS/400, OS/390, OS/400, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner,
WebSphere, Netfinity, Tivoli, Informix and Informix Dynamic ServerTM are trademarks of IBM
Corporation in USA and/or other countries.
UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.
Citrix, the Citrix logo, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame,
MultiWin and other Citrix product names referenced herein are trademarks of Citrix Systems, Inc.
HTML, DHTML, XML, XHTML are trademarks or registered trademarks of W3C, World Wide Web
Consortium, Massachusetts Institute of Technology.
JAVASCRIPT is a registered trademark of Sun Microsystems, Inc., used under license for
technology invented and implemented by Netscape.
SAP, SAP Logo, R/2, RIVA, R/3, SAP ArchiveLink, SAP Business Workflow, WebFlow,
SAP EarlyWatch, BAPI, SAPPHIRE, Management Cockpit, mySAP, mySAP.com, and other
SAP products and services mentioned herein as well as their respective logos are
trademarks or registered trademarks of SAP AG in Germany and in several other countries
all over the world. MarketSet and Enterprise Buyer are jointly owned trademarks of
SAP Markets and Commerce One. All other product and service names mentioned are the
trademarks of their respective owners.
Scoring BW 3.5
Advertising Measures
You can use Weighted Score Tables to define a customer valuation that is dependent on the
characteristics and key figures of the customers. You can use valuation to support advertising
measures, to make service offers or rebates, or to select customers of interest for other purposes.
Typical Input
The following table contains data that could make up part of the typical input data for weighted score
tables. The customer ID is the table's key, and State, Salary, and Status are discrete fields. The
remaining fields are continuous.
Customer ID  State  Salary           No. of Children  Status
1            WA     USD50K - USD70K  1                BRONZE
2            BC     USD70K - USD90K  1                BRONZE
3            WA     USD50K - USD70K  1                BRONZE
4            BC     USD10K - USD30K  4                NORMAL
5            CA     USD30K - USD50K  3                SILVER
6            WA     USD70K - USD90K  3                BRONZE
7            OX     USD30K - USD50K  2                BRONZE
8            DF     USD50K - USD70K  2                BRONZE
9            BC     USD10K - USD30K  5                NORMAL
10           OR     USD30K - USD50K  4                GOLDEN
11           CA     USD50K - USD70K  4                BRONZE
12           WA     USD50K - USD70K  1                BRONZE
Typical Output
The results of weighted score tables can be displayed in tabular form, showing the individual records
and their result values.
In the above example, the partial score for each model field, that is, Number of Children, State, Salary
and Status, is displayed.
The Analysis Process Designer (APD) is the application environment for the SAP data mining solution.
From SAP BW Release 3.5, data mining functions are fully integrated into the APD. You can perform
the following functions in the APD:
For more information, see SAP Library at help.sap.com under SAP NetWeaver -> Release 04 ->
Information Integration -> SAP Business Information Warehouse -> BI Platform -> Analysis Process
Designer / Data Mining
Model Fields
Model Parameters
Model fields are the attributes that define the object, and the predictable field is the class label. On
the Model Fields screen, you can add the fields that are required for creating decision trees. You must
define the content type for each model field.
Continuous: Continuous data can take any value in an interval of real numbers, so the value does not
have to be an integer. Attributes with an infinite set of possible real values are called continuous.
Typically, they have a minimum and a maximum value, and attribute values can lie anywhere within this
interval. Attributes such as Salary, Sales Revenue, and Quantity Sold are examples of continuous
attributes. You can discretize a continuous attribute by defining fixed intervals. For example, if the
salary ranges from $100 to $20000, you can form intervals such as $0 - $2000, $2000 - $4000,
$4000 - $6000, ..., $18000 - $20000. Each attribute value then falls into one of these intervals.
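The discretization described above can be sketched in a few lines of Python. This is only an illustration: the function name discretize and its parameters are hypothetical, and the choice to assign a boundary value such as $2000 to the upper interval is an assumption, since the text does not say which interval a boundary value belongs to.

```python
def discretize(value, lower=0, upper=20000, width=2000):
    """Return the (low, high) fixed-width interval that contains value.

    Assumption: a value on an inner boundary goes to the upper interval;
    the maximum value itself goes to the last interval.
    """
    if not lower <= value <= upper:
        raise ValueError(f"{value} lies outside [{lower}, {upper}]")
    last_index = (upper - lower) // width - 1
    index = min(int((value - lower) // width), last_index)
    low = lower + index * width
    return (low, low + width)


print(discretize(100))    # lowest salary from the example
print(discretize(3500))
print(discretize(20000))  # highest salary from the example
```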
Discrete: For model fields of type discrete, you specify the individual values of the field. As described
below, you can define a common partial weight for some of the remaining values. This weight is
applied only if you have set the Treat as separate instance indicator in the Outlier treatment tab page.
For more detailed information about handling outliers, see the section Parameters of the Model Fields
for Treating Outliers.
For a model field of type Discrete, the field parameters are as follows:
Weight of Model Field: You can define the weight of a model field.
Value: You enter the values for the model field in the column.
Partial Score: For each model field value that you enter, you must specify a partial weight. The
weight of the model field determines the share that these partial weights contribute to the score.
Partial Score for remaining values: You can define a single weight for all the remaining values in the
dataset.
The score is calculated as follows:
Score (Field1, Field2 ...) = Weight1 x Partial Weight1 (Field1) + Weight2 x Partial Weight2 (Field2) + ...
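As a rough illustration of this formula for discrete model fields, the following Python sketch builds a lookup per field and sums the weighted partial weights. The helper names weighted_score and discrete_partial, as well as all weights and values in the example, are hypothetical and are not taken from the SAP system.

```python
def discrete_partial(table, remaining):
    """Build a partial weight lookup for a discrete model field.

    Values that are not listed in `table` receive the common weight
    `remaining` (the "partial score for remaining values").
    """
    return lambda value: table.get(value, remaining)


def weighted_score(record, model_fields):
    """Sum weight x partial weight over all model fields.

    `model_fields` maps a field name to (weight, partial weight function).
    """
    return sum(
        weight * partial(record[name])
        for name, (weight, partial) in model_fields.items()
    )


# Hypothetical configuration, for illustration only.
model_fields = {
    "State": (2, discrete_partial({"WA": 5, "CA": 3}, remaining=1)),
    "Status": (1, discrete_partial({"GOLDEN": 10, "SILVER": 6}, remaining=0)),
}

print(weighted_score({"State": "WA", "Status": "SILVER"}, model_fields))  # 2 x 5 + 1 x 6
print(weighted_score({"State": "OR", "Status": "BRONZE"}, model_fields))  # 2 x 1 + 1 x 0
```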
In the case of continuous model fields, you specify partial weights for individual threshold values. You
must also specify how to deal with values of partial weights between the threshold values. You have
the following options:
Function is piecewise constant: Set this indicator to define the partial weight function as piecewise
constant. The function is then constant between each pair of threshold values; that is, the partial
weight of the left threshold or of the right threshold, depending on the setting, applies to the whole
interval.
Alternatively, if you do not set the Function is piecewise constant indicator, linear interpolation is
applied, so that the partial weight varies continuously between each pair of threshold values.
In each case, you have to specify at least two threshold values because the value range used for
outlier treatment lies above the largest threshold and beneath the smallest threshold.
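The two options can be written compactly as follows. The notation is an assumption introduced here for illustration and does not come from the source: t_1 < ... < t_m are the threshold values, p_1, ..., p_m are the partial weights assigned to them, and the piecewise constant variant is shown for the case in which the left threshold applies.

```latex
\begin{align*}
g_{\mathrm{const}}(x) &= p_k
  && \text{for } t_k \le x < t_{k+1} \\
g_{\mathrm{lin}}(x) &= p_k + \frac{x - t_k}{t_{k+1} - t_k}\,\bigl(p_{k+1} - p_k\bigr)
  && \text{for } t_k \le x \le t_{k+1}
\end{align*}
```

Values below t_1 or above t_m are not covered by either expression; they fall under outlier treatment.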
The score value to be defined is dependent on the discrete model field Status and the
continuous model field Salary. The weighting of these two model fields should be 3 and 1
respectively. In the model field Status, the data to be processed takes the values gold,
silver, bronze, copper, and iron. The following partial weightings could then be specified:
Value    Partial Weighting
Gold     10
Silver   6
Bronze   4
The partial weighting 2 can be assigned to the remaining values. For the model field
Salary, the threshold values and corresponding partial weightings could be assigned as
follows:
Threshold Value   Partial Weighting
0                 0
10 000            10
25 000            20
50 000            30
The partial weighting function should be piecewise constant and take the partial
weighting of the left threshold value in the interval between two threshold values. In this
way, the score value (silver, 40 000) = 3 x 6 + 1 x 20 = 38 is obtained. If the partial
weighting function for Salary should be continuous instead of piecewise constant,
linear interpolation gives the partial weighting
20 + (40 000 - 25 000) / (50 000 - 25 000) x (30 - 20) = 26,
which produces the score value (silver, 40 000) = 3 x 6 + 1 x 26 = 44.
If the Treat as separate instance option was selected in outlier handling for the model field
Status, then the function produces the score value (iron, 10 000) = 3 x 2 + 1 x 10 = 16.
If the Constant extrapolation option was chosen in outlier handling for the model field
Salary, then the function produces the score value (silver, 60 000) = 3 x 6 + 1 x 30 = 48.
If the Extrapolation option is chosen, the last segment is continued linearly, which gives the
partial weighting 30 + (60 000 - 50 000) / (50 000 - 25 000) x (30 - 20) = 34 and produces the
score value (silver, 60 000) = 3 x 6 + 1 x 34 = 52.
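The score values in this example can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the SAP implementation: the function and parameter names are hypothetical, the remaining-values weighting 2 is always applied to unlisted Status values (as if Treat as separate instance were set), and the Extrapolation option is modeled as a linear continuation of the outermost segment.

```python
from bisect import bisect_right

STATUS_WEIGHT, SALARY_WEIGHT = 3, 1
STATUS_PARTIAL = {"gold": 10, "silver": 6, "bronze": 4}
STATUS_REMAINING = 2  # common partial weighting for all other values
SALARY_POINTS = [(0, 0), (10_000, 10), (25_000, 20), (50_000, 30)]


def status_partial(status):
    """Look up the partial weighting of a discrete Status value."""
    return STATUS_PARTIAL.get(status, STATUS_REMAINING)


def salary_partial(salary, piecewise_constant=True, linear_extrapolation=False):
    """Partial weighting of Salary from the threshold table."""
    xs = [x for x, _ in SALARY_POINTS]
    ys = [y for _, y in SALARY_POINTS]
    inside = xs[0] <= salary <= xs[-1]
    if not inside and not linear_extrapolation:
        # Constant extrapolation: hold the outermost partial weighting.
        return ys[0] if salary < xs[0] else ys[-1]
    if inside and piecewise_constant:
        # Take the partial weighting of the left threshold.
        return ys[bisect_right(xs, salary) - 1]
    # Linear interpolation inside the table, or linear continuation of
    # the outermost segment for outliers.
    i = max(0, min(bisect_right(xs, salary) - 1, len(xs) - 2))
    return ys[i] + (salary - xs[i]) * (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])


def score(status, salary, **salary_options):
    """Score = 3 x partial(Status) + 1 x partial(Salary)."""
    return (STATUS_WEIGHT * status_partial(status)
            + SALARY_WEIGHT * salary_partial(salary, **salary_options))


print(score("silver", 40_000))                             # 38
print(score("silver", 40_000, piecewise_constant=False))   # 44.0
print(score("iron", 10_000))                               # 16
print(score("silver", 60_000))                             # 48
print(score("silver", 60_000, linear_extrapolation=True))  # 52.0
```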
For more detailed information about how to treat outliers, see the section Parameters of the Model
Fields for Treating Outliers.
For treating missing values, you first have to set the appropriate indicator and identify a missing value.
If, for example, the size of family is denoted by a numeric value and NA has been used to denote a
value that is unknown, you can enter NA as the Missing Value. You define a separate treatment for
this value accordingly.
You can make a setting to decide whether processing is stopped, the record is ignored, or the default
score is set when a value defined in this way occurs. Using the option Replace by value, you can
substitute the missing value with another value.
Default Score
You use this parameter to specify a default output value for weighted score tables. If required, this
value is always set whenever a record does not fulfill certain conditions (for example, it has missing
data or outliers). The default value for this field is 0 (zero).
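The four treatments for a missing value can be sketched as follows. The function name score_with_missing, the treatment keywords, and the example partial score function are hypothetical; only the marker NA, the four options, and the default score of 0 come from the text above.

```python
DEFAULT_SCORE = 0  # default value of the Default Score parameter


def score_with_missing(value, partial, missing="NA", treatment="default",
                       replacement=None):
    """Score one field value, applying the chosen missing-value treatment.

    Returns None when the record is to be ignored.
    """
    if value == missing:
        if treatment == "stop":
            raise RuntimeError("missing value found; processing stopped")
        elif treatment == "ignore":
            return None
        elif treatment == "default":
            return DEFAULT_SCORE
        elif treatment == "replace":
            value = replacement  # Replace by value
        else:
            raise ValueError(f"unknown treatment: {treatment!r}")
    return partial(value)


def family_partial(size):
    """Hypothetical partial score for the size of a family."""
    return 2 * size


print(score_with_missing(3, family_partial))
print(score_with_missing("NA", family_partial))
print(score_with_missing("NA", family_partial, treatment="ignore"))
print(score_with_missing("NA", family_partial, treatment="replace", replacement=2))
```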
Algorithm
Weighted Score Tables
A function f that is defined by weighted score tables is a linear combination of functions of one
variable each:
f(x_1, ..., x_n) = w_1 f_1(x_1) + ... + w_n f_n(x_n)
The weights w_1, ..., w_n are arbitrary numbers. Each of the functions f_1, ..., f_n is mapped to
exactly one model field. The arguments x_1, ..., x_n of these functions are the values that the model
fields can take.
For discrete model fields, the score table of the model field is used to directly assign a function value
f_i(x_i) to individual values x_i of the model field. A common function value can be assigned to values
that are not listed explicitly in the table.
For continuous model fields, the score table of the model field is also used to directly assign a function
value f_i(x_i) to individual values x_i of the model field. Between two such points, either a linear
interpolation is made or the function value from the left or right point is taken; this defines a polygon
line or a piecewise constant function, respectively. Depending on the option selected by the user, the
function is continued as linear or constant beyond the outer points.
Let us assume that you would like to valuate your customer data on the basis of the fields Occupation
and Age. For this, you could define a weighted score table function as follows:
Score (Occupation, Age) = w1 x ps1(Occupation) + w2 x ps2(Age)
w1 and w2 stand for the weights you give the two fields, such as w1 = 2 and w2 = 5.
ps1 and ps2 stand for the functions with which you define partial scores for both fields, as in the
following table:
Age   Partial Score (ps2)
0     0
20    6
30    10
50    4
65    2
Ages falling between those specified above should be interpolated. This then gives you, for example:
Score (Employee, 25) = 2 x ps1(Employee) + 5 x ps2(25) = 2 x 5 + 5 x 8 = 50
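This calculation can be checked with a small Python sketch. The names ps1, ps2, and score mirror the notation above, but the code itself is only an illustration. Of the occupation partial scores, only the value 5 for Employee can be taken from the example, so the lookup contains just that entry.

```python
from bisect import bisect_right

AGE_POINTS = [(0, 0), (20, 6), (30, 10), (50, 4), (65, 2)]
OCCUPATION_SCORES = {"Employee": 5}  # only entry recoverable from the example


def ps1(occupation):
    """Partial score of the discrete field Occupation."""
    return OCCUPATION_SCORES[occupation]


def ps2(age):
    """Partial score of Age, linearly interpolated between table points."""
    xs = [x for x, _ in AGE_POINTS]
    if not xs[0] <= age <= xs[-1]:
        raise ValueError("age outside the table; outlier treatment not modeled")
    i = min(bisect_right(xs, age) - 1, len(xs) - 2)
    (x0, y0), (x1, y1) = AGE_POINTS[i], AGE_POINTS[i + 1]
    return y0 + (age - x0) * (y1 - y0) / (x1 - x0)


def score(occupation, age, w1=2, w2=5):
    """Score (Occupation, Age) = w1 x ps1(Occupation) + w2 x ps2(Age)."""
    return w1 * ps1(occupation) + w2 * ps2(age)


print(ps2(25))                # 8.0
print(score("Employee", 25))  # 50.0
```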