
ASSIGNMENTS

Course Code    : MS-08
Course Title   : Quantitative Analysis for Managerial Applications
Assignment No. : 08/TMA-1/SEM-I/2011
Coverage       : All Blocks

Note: Answer all the questions and send them to the Coordinator of the Study Centre you are attached with.

1. Calculate the mean, median and mode from the following data relating to the production of a steel mill for 60 days.

Production (in tons per day) : 21-22   23-24   25-26   27-28   29-30
Number of days               :   7      13      22      10       8

Solution:

Class        No. of days (f)   Mid value (x)   d = (x - A)/h   d^2    fd     fd^2    c.f.
20.5-22.5           7               21.5             -2          4    -14      28       7
22.5-24.5          13               23.5             -1          1    -13      13      20
24.5-26.5          22             25.5 = A            0          0      0       0      42
26.5-28.5          10               27.5              1          1     10      10      52
28.5-30.5           8               29.5              2          4     16      32      60
Total          Σf = 60                                               Σfd = -1  Σfd^2 = 83

a) Mean = A + [(Σfd / Σf) × h], where A = assumed mean and h = class size

= 25.5 + [(-1/60) × 2] = 25.467 (approx.)

b) Median (Q2) = l1 + [{(N/2 - p.c.f.)/f} × (l2 - l1)]

Median class = 24.5-26.5

Median = 24.5 + [{(60/2 - 20)/22} × 2] = 25.409 (approx.)

c) Mode = l + [{(f - f1)/(2f - f1 - f2)} × h]

Modal class = 24.5-26.5

Mode = 24.5 + [{(22 - 13)/(2 × 22 - 13 - 10)} × 2] = 25.357 (approx.)
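As a quick cross-check, the grouped-data formulas above can be evaluated programmatically. The following is a minimal Python sketch; the variable names and the direct form of the mean are choices made here for illustration, using the class boundaries and frequencies from this question.

    # Grouped-data mean, median and mode for the steel-mill production data.
    boundaries = [20.5, 22.5, 24.5, 26.5, 28.5, 30.5]   # continuous class limits
    freq = [7, 13, 22, 10, 8]                            # number of days per class

    n = sum(freq)                                        # total frequency (60)
    h = boundaries[1] - boundaries[0]                    # class width (2)
    mids = [(boundaries[i] + boundaries[i + 1]) / 2 for i in range(len(freq))]

    # Mean (direct form, equivalent to the step-deviation method used above)
    mean = sum(f * x for f, x in zip(freq, mids)) / n

    # Median: locate the class containing the (n/2)-th observation
    cum = 0
    for i, f in enumerate(freq):
        if cum + f >= n / 2:
            median = boundaries[i] + ((n / 2 - cum) / f) * h
            break
        cum += f

    # Mode: class with the largest frequency
    m = freq.index(max(freq))
    mode = boundaries[m] + ((freq[m] - freq[m - 1]) /
                            (2 * freq[m] - freq[m - 1] - freq[m + 1])) * h

    print(round(mean, 3), round(median, 3), round(mode, 3))
    # Expected output (approx.): 25.467 25.409 25.357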

2. A restaurant is experiencing discontentment among its customers. It analyses that there are three factors responsible, viz. food quality, service quality and interior décor. By conducting an analysis, it assesses the probabilities of discontentment with the three factors as 0.40, 0.35 and 0.25 respectively. By conducting a survey among the customers, it also evaluated the probabilities of a customer going away discontented on account of these factors as 0.6, 0.8 and 0.5, respectively. With this information, the restaurant wants to know: if a customer is discontented, what are the probabilities that it is so due to food, service or interior décor?

Solution:

Let A be the event of discontentment due to food, B the event of discontentment due to service, and C the event of discontentment due to interior décor. Now, it is given that:

P(A) = 0.40
P(B) = 0.35
P(C) = 0.25

Let E represent the event of a customer going away discontented. Then it is given that:

P(E/A) = 0.6 (customer going away discontented due to food)
P(E/B) = 0.8 (due to service)
P(E/C) = 0.5 (due to interior décor)

Now, the probability that a discontented customer is so due to food is P(A/E), due to service is P(B/E), and due to interior décor is P(C/E).

According to Bayes' theorem,

a) P(A/E) = {P(E/A) × P(A)} / {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}
          = {0.6 × 0.40} / {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}
          = 0.372 (approx.)

b) P(B/E) = {P(E/B) × P(B)} / {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}
          = {0.8 × 0.35} / {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}
          = 0.434 (approx.)

c) P(C/E) = {P(E/C) × P(C)} / {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}
          = {0.5 × 0.25} / {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}
          = 0.194 (approx.)
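The same posterior probabilities can be checked with a few lines of code. A minimal Python sketch follows; the dictionary keys and variable names are chosen here purely for illustration.

    # Bayes' theorem: posterior probability of each cause given a discontented customer.
    priors = {"food": 0.40, "service": 0.35, "decor": 0.25}        # P(A), P(B), P(C)
    likelihoods = {"food": 0.6, "service": 0.8, "decor": 0.5}      # P(E/A), P(E/B), P(E/C)

    # Total probability of a customer going away discontented, P(E)
    p_e = sum(priors[k] * likelihoods[k] for k in priors)

    # Posterior P(cause/E) for each cause
    posteriors = {k: priors[k] * likelihoods[k] / p_e for k in priors}
    print({k: round(v, 3) for k, v in posteriors.items()})
    # Expected output (approx.): {'food': 0.372, 'service': 0.434, 'decor': 0.194}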

3. The monthly incomes of a group of 10,000 persons were found to be normally distributed with mean equal to 15,000 and standard deviation equal to 1000. What is the lowest income among the richest 250 persons? Solution :
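The richest 250 of the 10,000 persons form the top 250/10,000 = 2.5% of the distribution, so the required cut-off is the 97.5th percentile of a normal distribution with mean 15,000 and standard deviation 1,000, i.e. roughly 15,000 + 1.96 × 1,000 ≈ 16,960. A minimal Python sketch of this calculation is given below; SciPy is assumed to be available, and the variable names are chosen here for illustration.

    # Lowest income among the richest 250 of 10,000 normally distributed incomes.
    from scipy.stats import norm

    n_total, n_richest = 10_000, 250
    mean, sd = 15_000, 1_000

    # The richest 250 form the top 250/10000 = 2.5% of the distribution,
    # so the cut-off is the (1 - 0.025) = 0.975 quantile.
    cutoff = norm.ppf(1 - n_richest / n_total, loc=mean, scale=sd)
    print(round(cutoff))   # about 16960 (using z = 1.96)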

4. Write short notes on the following:
a. Test of goodness of fit
b. Critical region of a test
c. Exponential smoothing method

Solution:

a. Test of goodness of fit

The goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question. Such measures can be used in statistical hypothesis testing, e.g. to test for normality of residuals, to test whether two samples are drawn from identical distributions, or to test whether outcome frequencies follow a specified distribution. In the analysis of variance, one of the components into which the variance is partitioned may be a lack-of-fit sum of squares.

Fit of distributions: In assessing whether a given distribution is suited to a data set, the following tests and their underlying measures of fit can be used:

Kolmogorov-Smirnov test; Cramér-von Mises criterion; Anderson-Darling test.

Regression analysis: In regression analysis, the following topics relate to goodness of fit:

Coefficient of determination (the R-squared measure of goodness of fit); lack-of-fit sum of squares.

Example: One way in which a measure of goodness-of-fit statistic can be constructed, in the case where the variance of the measurement error is known, is to construct a weighted sum of squared errors:

χ² = Σ (O - E)² / σ²

where σ² is the known variance of the observation, O is the observed data and E the theoretical (expected) data.

This definition is only useful when one has estimates for the error on the measurements, but it leads to a situation where a chi-square distribution can be used to test goodness of fit, provided that the errors can be assumed to have a normal distribution. The reduced chi-squared statistic is simply the chi-squared divided by the number of degrees of freedom:

χ²_red = χ² / ν

where ν is the number of degrees of freedom, usually given by N - n - 1, where N is the number of observations and n is the number of fitted parameters, assuming that the mean value is an additional fitted parameter. The advantage of the reduced chi-squared is that it already normalizes for the number of data points and model complexity. As a rule of thumb, a χ²_red much greater than 1 indicates a poor model fit; a χ²_red greater than 1 indicates that the fit has not fully captured the data (or that the error variance has been under-estimated); a χ²_red less than 1 indicates that the model is 'over-fitting' the data (either the model is improperly fitting noise, or the error variance has been over-estimated). In principle a χ²_red of about 1 indicates that the extent of the match between observations and estimates is in accord with the error variance.
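To make the definitions concrete, here is a small Python sketch computing the weighted sum of squared errors and the reduced chi-squared. The observations, the fitted mean and the variance are made-up values chosen only for this example.

    # Weighted sum of squared errors (chi-squared) and reduced chi-squared.
    observed = [10.1, 9.8, 10.3, 9.9, 10.2]   # illustrative measurements
    expected = [10.0] * 5                      # model prediction (a fitted mean of 10.0)
    sigma2 = 0.04                              # known variance of each observation

    chi2 = sum((o - e) ** 2 / sigma2 for o, e in zip(observed, expected))

    N = len(observed)       # number of observations
    n = 0                   # fitted parameters besides the mean
    nu = N - n - 1          # degrees of freedom (the mean counts as an extra fitted parameter)
    chi2_red = chi2 / nu

    print(chi2, chi2_red)   # roughly 4.75 and 1.19, i.e. a fit consistent with the stated variance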

Categorical data: The following are examples that arise in the context of categorical data.

Pearson's chi-square test: Pearson's chi-square test uses a measure of goodness of fit which is the sum of differences between observed and expected outcome frequencies (that is, counts of observations), each squared and divided by the expectation:

χ² = Σ (Oi - Ei)² / Ei

where:
Oi = an observed frequency (i.e. count) for the ith bin
Ei = an expected (theoretical) frequency for the ith bin, asserted by the null hypothesis.

The resulting value can be compared to the chi-square distribution to determine the goodness of fit. In order to determine the degrees of freedom of the chi-squared distribution, one takes the total number of observed frequencies and subtracts one. For example, if there are eight different frequencies, one would compare to a chi-squared with seven degrees of freedom.

Example: equal frequencies of men and women. To test the hypothesis that a random sample of 100 people has been drawn from a population in which men and women are equal in frequency, the observed number of men and women would be compared to the theoretical frequencies of 50 men and 50 women. If there were 44 men in the sample and 56 women, then

χ² = (44 - 50)²/50 + (56 - 50)²/50 = 1.44

If the null hypothesis is true (i.e., men and women are chosen with equal probability in the sample), the test statistic will be drawn from a chi-square distribution with one degree of freedom. Though one might expect two degrees of freedom (one each for the men and women), we must take into account that the total number of men and women is constrained (100), and thus there is only one degree of freedom (2 - 1). Alternatively, if the male count is known the female count is determined, and vice versa.

Consultation of the chi-square distribution for 1 degree of freedom shows that the probability of observing this difference (or a more extreme difference than this) if men and women are equally numerous in the population is approximately 0.23. This probability is higher than conventional criteria for statistical significance (0.001-0.05), so normally we would not reject the null hypothesis that the number of men in the population is the same as the number of women (i.e. we would consider our sample within the range of what we would expect for a 50/50 male/female ratio).

Binomial case: A binomial experiment is a sequence of independent trials in which the trials can result in one of two outcomes, success or failure. There are n trials, each with probability of success denoted by p. Provided that npi >> 1 for every i (where i = 1, 2, ..., k), then

χ² = Σ (Ni - npi)² / (npi), the sum running over i = 1, ..., k.

This has approximately a chi-squared distribution with k - 1 df. The fact that df = k - 1 is a consequence of the restriction ΣNi = n. We know there are k observed cell counts; however, once any k - 1 are known, the remaining one is uniquely determined. Basically, one can say there are only k - 1 freely determined cell counts, thus df = k - 1.
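The men/women example above can be reproduced directly. A minimal Python sketch using SciPy, with the observed counts taken from that example:

    # Pearson's chi-square goodness-of-fit test for the 44 men / 56 women sample.
    from scipy.stats import chisquare

    observed = [44, 56]          # observed counts of men and women
    expected = [50, 50]          # expected counts under the null hypothesis (50/50)

    stat, p_value = chisquare(f_obs=observed, f_exp=expected)
    print(round(stat, 2), round(p_value, 2))   # 1.44 and ~0.23, with 2 - 1 = 1 df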

Other measures of fit: The likelihood ratio test statistic is a measure of the goodness of fit of a model, judged by whether an expanded form of the model provides a substantially improved fit.

b. Critical region of a test

A statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study (not controlled). In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level. The phrase "test of significance" was coined by Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first." Hypothesis testing is sometimes called confirmatory data analysis, in contrast to exploratory data analysis.

In frequency probability, these decisions are almost always made using null-hypothesis tests, i.e. tests that answer the question: assuming that the null hypothesis is true, what is the probability of observing a value for the test statistic that is at least as extreme as the value that was actually observed? One use of hypothesis testing is deciding whether experimental results contain enough information to cast doubt on conventional wisdom. A result that was found to be statistically significant is also called a positive result; conversely, a result whose probability under the null hypothesis exceeds the significance level is called a negative result or a null result.

Statistical hypothesis testing is a key technique of frequentist statistical inference. The Bayesian approach to hypothesis testing is to base rejection of the hypothesis on the posterior probability. Other approaches to reaching a decision based on data are available via decision theory and optimal decisions.

The critical region of a hypothesis test is the set of all outcomes which, if they occur, will lead us to decide that there is a difference, that is, cause the null hypothesis to be rejected in favor of the alternative hypothesis. The critical region is usually denoted by the letter C.

Clairvoyant card game: A person (the subject) is tested for clairvoyance. He is shown the reverse of a randomly chosen playing card 25 times and asked which suit it belongs to. The number of hits, or correct answers, is called X.

As we try to find evidence of his clairvoyance, for the time being the null hypothesis is that the person is not clairvoyant. The alternative is, of course: the person is (more or less) clairvoyant.

If the null hypothesis is valid, the only thing the test person can do is guess. For every card, the probability (relative frequency) of guessing correctly is 1/4. If the alternative is valid, the test subject will predict the suit correctly with probability greater than 1/4. We will call the probability of guessing correctly p. The hypotheses, then, are:

null hypothesis: H0: p = 1/4 (just guessing)

and

alternative hypothesis: H1: p > 1/4 (true clairvoyant).

When the test subject correctly predicts all 25 cards, we will consider him clairvoyant and reject the null hypothesis; likewise with 24 or 23 hits. With only 5 or 6 hits, on the other hand, there is no cause to consider him so. But what about 12 hits, or 17 hits? What is the critical number, c, of hits, at which point we consider the subject to be clairvoyant? How do we determine the critical value c? It is obvious that with the choice c = 25 (i.e. we only accept clairvoyance when all cards are predicted correctly) we are more critical than with c = 10. In the first case almost no test subjects will be recognized as clairvoyant; in the second case, some number more will pass the test. In practice, one decides how critical one will be. That is, one decides how often one accepts an error of the first kind - a false positive, or Type I error. With c = 25 the probability of such an error is

P(reject H0 | H0 is valid) = P(X = 25 | p = 1/4) = (1/4)^25 ≈ 10^-15,

and hence, very small. The probability of a false positive is the probability of randomly guessing correctly all 25 times. Being less critical, with c = 10, gives

P(reject H0 | H0 is valid) = P(X >= 10 | p = 1/4) ≈ 0.07.

Thus, c = 10 yields a much greater probability of a false positive.


Before the test is actually performed, the desired probability of a Type I error is determined. Typically, values in the range of 1% to 5% are selected. Depending on this desired Type I error rate, the critical value c is calculated. For example, if we select an error rate of 1%, c is calculated thus:

P(reject H0 | H0 is valid) = P(X >= c | p = 1/4) <= 0.01.

From all the numbers c with this property, we choose the smallest, in order to minimize the probability of a Type II error, a false negative. For the above example, we select c = 13.

But what if the subject did not guess any cards at all? Having zero correct answers is clearly an oddity too. The probability of guessing incorrectly once is equal to p' = (1 - p) = 3/4. Using the same approach we can calculate that the probability of randomly calling all 25 cards wrong is

P(X = 0 | p = 1/4) = (3/4)^25 ≈ 0.00075.

This is highly unlikely (less than 1 in 1,000 chance). Even though the subject cannot guess the cards correctly, dismissing H0 in favour of H1 would be an error. In fact, the result would suggest a trait on the subject's part of avoiding calling the correct card. A test of this could be formulated: for a selected 1% error rate the subject would have to answer correctly at least twice for us to believe that card calling is based purely on guessing.
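The probabilities quoted in this example can be reproduced with the binomial distribution. A minimal Python sketch using SciPy:

    # Type I error probabilities for the clairvoyance card game (25 cards, p = 1/4 under H0).
    from scipy.stats import binom

    n, p = 25, 0.25

    # False-positive probability for two choices of the critical value c: P(X >= c | H0)
    print(binom.sf(24, n, p))    # c = 25: P(X >= 25) = (1/4)**25
    print(binom.sf(9, n, p))     # c = 10: P(X >= 10), about 0.07

    # Smallest c whose false-positive probability does not exceed 1%
    c = next(c for c in range(n + 1) if binom.sf(c - 1, n, p) <= 0.01)
    print(c)                     # 13

    # Probability of calling all 25 cards wrong: (3/4)**25, about 0.00075
    print(binom.pmf(0, n, p))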

c. Exponential smoothing method

Exponential smoothing is a technique that can be applied to time series data, either to produce smoothed data for presentation or to make forecasts. The time series data themselves are a sequence of observations. The observed phenomenon may be an essentially random process, or it may be an orderly, but noisy, process. Whereas in the simple moving average the past observations are weighted equally, exponential smoothing assigns exponentially decreasing weights over time.

Exponential smoothing is commonly applied to financial market and economic data, but it can be used with any discrete set of repeated measurements. The raw data sequence is often represented by {xt}, and the output of the exponential smoothing algorithm is commonly written as {st}, which may be regarded as a best estimate of what the next value of x will be. When the sequence of observations begins at time t = 0, the simplest form of exponential smoothing is given by the formulas

s1 = x0
st = α xt-1 + (1 - α) st-1, for t > 1


where α is the smoothing factor, and 0 < α < 1.

The simple moving average: Intuitively, the simplest way to smooth a time series is to calculate a simple, or unweighted, moving average. The smoothed statistic st is then just the mean of the last k observations:

st = (xt + xt-1 + ... + xt-k+1) / k

where the choice of an integer k > 1 is arbitrary. A small value of k will have less of a smoothing effect and be more responsive to recent changes in the data, while a larger k will have a greater smoothing effect and produce a more pronounced lag in the smoothed sequence. One disadvantage of this technique is that it cannot be used on the first k - 1 terms of the time series.

The weighted moving average: A slightly more intricate method for smoothing a raw time series {xt} is to calculate a weighted moving average by first choosing a set of weighting factors w1, w2, ..., wk such that

w1 + w2 + ... + wk = 1

and then using these weights to calculate the smoothed statistics {st}:

st = w1 xt + w2 xt-1 + ... + wk xt-k+1

In practice the weighting factors are often chosen to give more weight to the most recent terms in the time series and less weight to older data. Notice that this technique has the same disadvantage as the simple moving average technique (i.e., it cannot be used until at least k observations have been made), and that it entails a more complicated calculation at each step of the smoothing procedure. In addition to this disadvantage, if the data from each stage of the averaging are not available for analysis, it may be difficult if not impossible to reconstruct a changing signal accurately (because older samples may be given less weight). If the number of stages missed is known, however, the weighting of values in the average can be adjusted to give equal weight to all missed samples to avoid this issue.
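A minimal Python sketch of the weighted moving average described above; the sample data and the weights are arbitrary illustrative choices (equal weights of 1/k reduce it to the simple moving average).

    # Weighted moving average of the last k observations.
    def weighted_moving_average(x, weights):
        """Return the smoothed series; weights[0] applies to the most recent value."""
        k = len(weights)
        assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
        # The first k-1 terms cannot be smoothed, as noted in the text.
        return [sum(w * x[t - i] for i, w in enumerate(weights))
                for t in range(k - 1, len(x))]

    series = [10, 12, 13, 12, 15, 16, 18]           # illustrative data
    print(weighted_moving_average(series, [0.5, 0.3, 0.2]))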


The exponential moving average: The simplest form of exponential smoothing is given by the formulae

s1 = x0
st = α xt-1 + (1 - α) st-1, for t > 1

where α is the smoothing factor, and 0 < α < 1. In other words, the smoothed statistic st is a simple weighted average of the previous observation xt-1 and the previous smoothed statistic st-1. The term smoothing factor applied to α here is something of a misnomer, as larger values of α actually reduce the level of smoothing; in the limiting case with α = 1 the output series is just the same as the original series. Simple exponential smoothing is easily applied, and it produces a smoothed statistic as soon as two observations are available.

Values of α close to one have less of a smoothing effect and give greater weight to recent changes in the data, while values of α closer to zero have a greater smoothing effect and are less responsive to recent changes. There is no formally correct procedure for choosing α. Sometimes the statistician's judgment is used to choose an appropriate factor. Alternatively, a statistical technique may be used to optimize the value of α. For example, the method of least squares might be used to determine the value of α for which the sum of the quantities (sn-1 - xn-1)^2 is minimized.

This technique does not share the disadvantage that it cannot be used until a minimum number of observations have been made, though in practice a "good average" will not be achieved until several samples have been averaged together (a constant signal will take approximately 3/α stages to reach 95% of the actual value). To accurately reconstruct the original signal without information loss, all stages of the exponential moving average must also be available, because older samples decay in weighting exponentially. In the simple moving average some samples can be skipped without as much loss of information, due to the constant weighting of samples within the average. If a known number of samples will be missed, a weighted average can be adjusted for this as well, by giving equal weight to the new sample and all those to be skipped.

This simple form of exponential smoothing is also known as an exponentially weighted moving average (EWMA). Technically it can also be classified as an autoregressive integrated moving average ARIMA (0,1,1) model with no constant term.


By direct substitution of the defining equation for simple exponential smoothing back into itself, we find that

st = α xt-1 + (1 - α) st-1
   = α xt-1 + α(1 - α) xt-2 + (1 - α)^2 st-2
   = α [xt-1 + (1 - α) xt-2 + (1 - α)^2 xt-3 + (1 - α)^3 xt-4 + ...] + (1 - α)^(t-1) s1

In other words, as time passes the smoothed statistic st becomes the weighted average of a greater and greater number of the past observations xt-n, and the weights assigned to previous observations are in general proportional to the terms of the geometric progression {1, (1 - α), (1 - α)^2, (1 - α)^3, ...}. A geometric progression is the discrete version of an exponential function, so this is where the name for this smoothing method originated.
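A minimal Python sketch of simple exponential smoothing as defined above; the sample series and the value of α are illustrative assumptions.

    # Simple exponential smoothing: s1 = x0 and st = alpha*x(t-1) + (1 - alpha)*s(t-1).
    def exponential_smoothing(x, alpha):
        """Return [s1, s2, ..., sn] for observations x = [x0, x1, ..., x(n-1)]."""
        s = [x[0]]                                   # s1 = x0
        for xt in x[1:]:                             # each new observation updates the estimate
            s.append(alpha * xt + (1 - alpha) * s[-1])
        return s

    series = [3.0, 5.0, 9.0, 20.0, 12.0, 17.0]       # illustrative observations
    print(exponential_smoothing(series, alpha=0.5))

The last element of the returned list can be read as the best estimate of the next value of x, in line with the description above.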

Double exponential smoothing: Simple exponential smoothing does not do well when there is a trend in the data. In such situations, double exponential smoothing can be used. Again, the raw data sequence of observations is represented by {xt}, beginning at time t = 0. We use {st} to represent the smoothed value for time t, and {bt} is our best estimate of the trend at time t. The output of the algorithm is now written as Ft+m, an estimate of the value of x at time t+m, m > 0, based on the raw data up to time t. Double exponential smoothing is given by the formulas

s0 = x0
st = α xt + (1 - α)(st-1 + bt-1)
bt = β (st - st-1) + (1 - β) bt-1
Ft+m = st + m bt

where α is the data smoothing factor, 0 < α < 1, β is the trend smoothing factor, 0 < β < 1, and b0 is taken as (xn-1 - x0)/(n - 1) for some n > 1. Note that F0 is undefined (there is no estimation for time 0); according to the definition F1 = s0 + b0, which is well defined, and further values can then be evaluated.
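A minimal Python sketch of the double exponential smoothing recursion above; the data, α, β and the forecast horizon m are illustrative assumptions.

    # Double exponential smoothing: level st, trend bt, forecast F(t+m) = st + m*bt.
    def double_exponential_smoothing(x, alpha, beta, m=1):
        """Return the list of forecasts F(t+m) for t = 0, 1, ..., len(x)-1."""
        n = len(x)
        s = x[0]                              # s0 = x0
        b = (x[n - 1] - x[0]) / (n - 1)       # b0 = (x(n-1) - x0) / (n - 1)
        forecasts = [s + m * b]               # F(0+m) = s0 + m*b0
        for t in range(1, n):
            s_prev = s
            s = alpha * x[t] + (1 - alpha) * (s_prev + b)
            b = beta * (s - s_prev) + (1 - beta) * b
            forecasts.append(s + m * b)       # F(t+m)
        return forecasts

    series = [3.0, 5.0, 9.0, 12.0, 18.0, 23.0]    # illustrative trending data
    print(double_exponential_smoothing(series, alpha=0.5, beta=0.3, m=1))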
