
Math 533: Applied Managerial Statistics

Course Project B

Introduction
AJ Davis is a department store chain that has several credit
customers and wants to find out more about them.
The following report presents a detailed statistical analysis of the
data collected from a sample of the chain's credit customers.
AJ Davis has compiled a sample of fifty (50) credit customers with data
on the following variables: Location, Income (in $1,000s), Size
(number of people living in the household), Years (number of years the
customer has lived in the current location), and Credit Balance
(the customer's current balance on the store's credit card, in $).
The manager at AJ Davis has speculated the following:
a. The average (mean) annual income was less than $50,000.
b. The true population proportion of customers who live in an urban
area exceeds 40%.
c. The average (mean) number of years lived in the current home is
less than 13 years.
d. The average (mean) credit balance for suburban customers is
less than $4,300.
I will evaluate each of these speculations by performing a
hypothesis test (using the seven elements of a test of hypothesis
with α = 0.05) to see whether there is evidence to support my
manager's belief in each case (a-d). For each test I explain my
conclusion in simple terms and compute and interpret the p-value.
I then follow up with 95% confidence intervals for each of the
variables described in a. to d., along with interpretations of these
intervals. This paper also includes an Appendix with the steps in
hypothesis testing, the confidence intervals, and the Minitab
output.
To understand how hypothesis testing is done, it is important
to know the elements of a test of hypothesis and what each
step means.
The seven elements of a test of hypothesis are:
Null hypothesis (H0) - A theory about the specific values of
one or more population parameters. The theory generally
represents the status quo, which we accept until it is proven
false.
Alternative (research) hypothesis (Ha) - A theory that
contradicts the null hypothesis. The theory generally
represents that which we will accept only when sufficient
evidence exists to establish its truth.
Test statistic - A sample statistic used to decide whether to
reject the null hypothesis.
Rejection region - The numerical values of the test statistic
for which the null hypothesis will be rejected.
Assumptions - Clear statements of any assumptions made
about the populations being sampled.
Experiment and calculation of test statistic - Performance of
the sampling experiment and determination of the numerical
value of the test statistic.
Conclusion -
a. If the numerical value of the test statistic falls in the
rejection region, then we reject the null hypothesis and
conclude that the alternative hypothesis is true.
b. If the test statistic does not fall in the rejection region,
then we do not reject H0, as we have insufficient evidence to do
so.
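
As an illustration of the rejection region element, the sketch below shows where critical values such as -1.645 come from. It assumes Python's standard-library statistics.NormalDist as a stand-in for a z table or Minitab; the helper name rejection_region is hypothetical and is not part of the original analysis.

```python
from statistics import NormalDist


def rejection_region(alpha, tail):
    """Return (critical z, description) for a z test.

    tail is 'lower' (Ha uses <), 'upper' (Ha uses >), or 'two' (Ha uses !=).
    This helper is a hypothetical illustration, not a Minitab function.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "lower":
        crit = std_normal.inv_cdf(alpha)
        return crit, f"z < {crit:.3f}"
    if tail == "upper":
        crit = std_normal.inv_cdf(1 - alpha)
        return crit, f"z > {crit:.3f}"
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
        return crit, f"|z| > {crit:.3f}"
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


# Rejection regions at the alpha = 0.05 level used throughout this report
for tail in ("lower", "upper", "two"):
    print(tail, rejection_region(0.05, tail)[1])
```

The one-tailed regions at α = 0.05 are the z < -1.645 and z > 1.645 cutoffs used in the tests that follow.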
a. The average (mean) annual income was less than $50,000
I found the average annual income to be 43.74, or $43,740, and
the standard deviation to be 14.64, or $14,640.
Set up Hypothesis Test
H0: μ = 50
Ha: μ < 50
For α = 0.05 and < in the Ha, I found that z = -1.645, so the
rejection region would be z < -1.645.
Next I calculated the test statistic z using the formula
z = (x̄ - μ0) / σ_x̄
where μ0 is the mean in the null hypothesis and σ_x̄ = s/√n.
z = (43.74 - 50)/2.0704 = -3.02, because σ_x̄ = 14.64/√50 =
14.64/7.0711 = 2.0704
The p-value = 0.001. Comparing the p-value to alpha is a
complementary and equally valid way to evaluate the null and
alternative hypotheses. If the p-value is less than alpha, we reject
the null hypothesis and accept the alternative hypothesis at the
given alpha. The test statistic and p-value methods always give
the same reject or do-not-reject result.
Because the p-value = 0.001 is less than alpha = 0.05, we reject
the null hypothesis H0: μ = 50 and we accept the alternative
hypothesis Ha: μ < 50 at α = 0.05.
My calculated test statistic of -3.02 falls in the rejection region of
z < -1.645; therefore, I would reject the null hypothesis and
say there is sufficient evidence to indicate μ < 50, or $50,000.
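
The arithmetic for this test can be checked with a short script. This is a minimal sketch, assuming Python's standard-library statistics.NormalDist in place of Minitab and using the sample standard deviation as the assumed population value; the variable names are illustrative. It should agree with the Minitab output in the Appendix (Z = -3.02, P = 0.001).

```python
from math import sqrt
from statistics import NormalDist

n = 50          # sample size
x_bar = 43.74   # sample mean income, in $1,000s
s = 14.64       # sample standard deviation, used as the assumed sigma
mu_0 = 50       # mean under the null hypothesis
alpha = 0.05

std_err = s / sqrt(n)                 # standard error of the mean
z = (x_bar - mu_0) / std_err          # test statistic
p_value = NormalDist().cdf(z)         # lower-tailed test: P(Z <= z)
z_crit = NormalDist().inv_cdf(alpha)  # boundary of the rejection region
reject_h0 = z < z_crit

print(f"SE = {std_err:.4f}, z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```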

b. The true population proportion of customers who live in an
urban area exceeds 40%
22 of the 50 customers surveyed live in an urban area, which is 44%,
or 0.44; this is the point estimate for p.
Therefore my hypotheses are
H0: p = 0.40 vs. Ha: p > 0.40
To conduct the large-sample z-test, we first need to verify
that the sample size is large enough.
np0 = 50(0.40) = 20 and n(1 - p0) = 50(1 - 0.40) = 30; both are
larger than 15, so we can conclude that the sample size is large
enough to apply the large-sample z-test.
z = (0.44 - 0.40)/0.069282 = 0.58
where σ_p̂ = sqrt((0.40)(0.60)/50) = 0.069282
This is a one-tailed test (upper, or right-tailed, since Ha has >). Our
rejection region is z > 1.645.
0.58 is not greater than 1.645 (and so is not in the rejection region),
so we do not reject H0.
The p-value = 0.282. If the p-value is less than alpha, we reject
the null hypothesis and accept the alternative hypothesis at the
given alpha; otherwise we do not reject the null hypothesis. The
test statistic and p-value methods give the same result.
Because the p-value = 0.282 is more than alpha = 0.05, we do not
reject the null hypothesis H0: p = 0.40 and we do not accept the
alternative hypothesis Ha: p > 0.40 at α = 0.05.
Since we are not rejecting H0, we are saying there is insufficient
evidence to conclude that the true population proportion of customers
who live in an urban location is greater than 40%.
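
The proportion test can be sketched the same way, including the sample-size check. This is a minimal illustration assuming Python's standard-library statistics.NormalDist rather than Minitab; the variable names are illustrative. Note that the standard error uses the hypothesized p0 = 0.40, not the sample proportion. The result should agree with the Minitab output in the Appendix (Z = 0.58, P = 0.282).

```python
from math import sqrt
from statistics import NormalDist

x = 22        # customers in the sample who live in an urban area
n = 50        # sample size
p_0 = 0.40    # proportion under the null hypothesis
alpha = 0.05

# Large-sample condition: both expected counts should be at least 15
large_enough = n * p_0 >= 15 and n * (1 - p_0) >= 15

p_hat = x / n                          # point estimate of p
std_err = sqrt(p_0 * (1 - p_0) / n)    # standard error under H0
z = (p_hat - p_0) / std_err            # test statistic
p_value = 1 - NormalDist().cdf(z)      # upper-tailed test: P(Z >= z)
reject_h0 = p_value < alpha

print(f"large enough: {large_enough}, z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```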

c. The average (mean) number of years lived in the current
home is less than 13 years.
From the survey data, I found the average number of years in the
current home to be 12.260 and the standard deviation to be 5.086.
Set up Hypothesis Test
H0: μ = 13
Ha: μ < 13
For α = 0.05 and < in the Ha, I found that z = -1.645, so the
rejection region would be z < -1.645.
Now I calculate the test statistic
z = (x̄ - μ0) / σ_x̄
where μ0 is the mean in the null hypothesis and σ_x̄ = s/√n.
z = (12.26 - 13)/0.7193 = -1.03, because σ_x̄ = 5.086/√50 = 0.7193
Because the p-value = 0.152 is more than alpha = 0.05, we do not
reject the null hypothesis H0: μ = 13 and we do not accept the
alternative hypothesis Ha: μ < 13 at α = 0.05.
My calculated test statistic of -1.03 does not fall in the rejection
region of z < -1.645; therefore, we would not reject the null
hypothesis and say there is insufficient evidence to indicate
μ < 13.

d. The average (mean) credit balance for suburban customers
is less than $4300.
I found the average credit balance for those surveyed to be $3,970,
and the standard deviation to be $933.49.
Set up Hypothesis Test
H0: μ = 4300
Ha: μ > 4300
For α = 0.05 and > in the Ha, I found z = 1.645, so the rejection
region would be z > 1.645.
Now I calculate the test statistic
z = (x̄ - μ0) / σ_x̄
where μ0 is the mean in the null hypothesis and σ_x̄ = s/√n.
z = (3970 - 4300)/132.0 = -2.50, because σ_x̄ = 933.49/√50 = 132.0
The p-value = 0.994. If the p-value is less than alpha, we reject
the null hypothesis and accept the alternative hypothesis at the
given alpha; otherwise we do not reject the null hypothesis. The
test statistic and p-value methods give the same result.
Because the p-value = 0.994 is not less than alpha = 0.05, we do not
reject the null hypothesis H0: μ = 4300 and we do not accept
the alternative hypothesis Ha: μ > 4300 at α = 0.05.
My calculated test statistic of -2.50 does not fall in the rejection
region of z > 1.645; therefore, I would not reject the null
hypothesis and say there is insufficient evidence to indicate
μ > 4300.
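
The sketch below recomputes this test statistic and shows why the upper-tailed p-value is so close to 1. It assumes Python's standard-library statistics.NormalDist in place of Minitab and uses the standard deviation from the Minitab output in the Appendix (933.49); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 50         # sample size used in the Minitab run
x_bar = 3970   # sample mean credit balance, in $
s = 933.49     # standard deviation from the Minitab output
mu_0 = 4300    # mean under the null hypothesis

std_err = s / sqrt(n)
z = (x_bar - mu_0) / std_err

area_below = NormalDist().cdf(z)   # P(Z <= z)
p_upper = 1 - area_below           # upper-tailed p-value for Ha: mu > 4300

print(f"SE = {std_err:.1f}, z = {z:.2f}")
print(f"area below z = {area_below:.3f}, upper-tailed p-value = {p_upper:.3f}")
```

Because z is negative, almost all of the standard normal area lies above it, so an upper-tailed test returns a p-value near 1. The small area below z is the complement of that p-value.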

Appendix
2) Follow this up with computing 95% confidence intervals for
each of the variables described in a. - d., and again interpreting
these intervals.
a. The average (mean) annual income was less than $50,000
One-Sample Z: Income ($1000)
The assumed standard deviation = 14.64
Variable         N   Mean  StDev  SE Mean  95% CI
Income ($1000)  50  43.74  14.64     2.07  (39.68, 47.80)
Conclusion: According to the confidence interval, we are 95%
confident that the true mean income lies between $39,680 and
$47,800.
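
The interval above can be reproduced with the usual formula x̄ ± z(α/2) · s/√n. The following is a minimal sketch, assuming Python's standard-library statistics.NormalDist in place of Minitab; the helper name z_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sd, n, conf=0.95):
    """Two-sided z confidence interval for a mean (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


# Income in $1,000s: sample mean 43.74, assumed sd 14.64, n = 50
lower, upper = z_interval(43.74, 14.64, 50)
print(f"95% CI for mean income: ({lower:.2f}, {upper:.2f})")
```

The same helper applies to the Years and Credit Balance variables by substituting their means and standard deviations.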
b. The true population proportion of customers who live in an
urban area exceeds 40%
Sample   X   N  Sample p  95% CI                Z-Value  P-Value
1       22  50  0.440000  (0.302411, 0.577589)     0.58    0.564
Conclusion: According to the confidence interval, we are 95%
confident that the true population proportion of customers who live
in an urban area lies between 0.302 and 0.578.
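
For a proportion, the interval uses the sample proportion in the standard error, unlike the hypothesis test, which uses the hypothesized value. The sketch below is a minimal illustration assuming Python's standard-library statistics.NormalDist in place of Minitab; the helper name proportion_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, conf=0.95):
    """Large-sample z confidence interval for a proportion (hypothetical helper)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)  # standard error uses p_hat
    return p_hat - margin, p_hat + margin


# 22 of the 50 sampled customers live in an urban area
lower, upper = proportion_interval(22, 50)
print(f"95% CI for urban proportion: ({lower:.6f}, {upper:.6f})")
```

Since the hypothesized value 0.40 lies inside this interval, the interval is consistent with not rejecting H0: p = 0.40.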
c. The average (mean) number of years lived in the current
home is less than 13 years
One-Sample Z: Years
The assumed standard deviation = 5.086
Variable   N    Mean  StDev  SE Mean  95% CI
Years     50  12.260  5.086    0.719  (10.850, 13.670)

Conclusion: According to the confidence interval, we are 95%
confident that the true mean number of years customers have lived in
their current homes lies between 10.85 and 13.67 years.


d. The average (mean) credit balance for suburban customers
is less than $4300
One-Sample Z: Credit Balance($)
The assumed standard deviation = 932
Variable            N  Mean   StDev  SE Mean  95% CI
Credit Balance($)  50  3970  933.49      132  (3712, 4229)

Conclusion: We are 95% confident that the true mean credit balance
lies between $3,712 and $4,229.
Minitab calculations for first part of Part B Project
a. The average (mean) annual income was less than $50,000
Descriptive Statistics: Income ($1000)
Variable         Mean  StDev  Minimum  Maximum
Income ($1000)  43.74  14.64    21.00    67.00

One-Sample Z
Test of mu = 50 vs < 50
The assumed standard deviation = 14.64
                    95% Upper
 N   Mean  SE Mean      Bound      Z      P
50  43.74     2.07      47.15  -3.02  0.001

b. The true population proportion of customers who live in an
urban area exceeds 40%.
Location  Count  Percent
Rural        13    26.00
Suburban     15    30.00
Urban        22    44.00
N =          50

Test and CI for One Proportion
Test of p = 0.4 vs p > 0.4
                          95% Lower
Sample   X   N  Sample p      Bound  Z-Value  P-Value
1       22  50  0.440000   0.324532     0.58    0.282

Test and CI for One Proportion
Sample   X   N  Sample p  95% CI
1       22  50  0.440000  (0.302411, 0.577589)

c. The average (mean) number of years lived in the current
home is less than 13 years
Descriptive Statistics: Years
Variable    Mean  StDev  Minimum  Maximum
Years     12.260  5.086    1.000   20.000

One-Sample Z: Years
Test of mu = 13 vs < 13
The assumed standard deviation = 5.086
                                      95% Upper
Variable   N    Mean  StDev  SE Mean      Bound      Z      P
Years     50  12.260  5.086    0.719     13.443  -1.03  0.152
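
The 95% upper bound that Minitab reports alongside a lower-tailed test is a one-sided confidence bound, x̄ + z(α) · s/√n. The sketch below reproduces it for the Years variable, assuming Python's standard-library statistics.NormalDist in place of Minitab; the helper name upper_bound is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_bound(mean, sd, n, conf=0.95):
    """One-sided upper confidence bound for a mean (hypothetical helper)."""
    z = NormalDist().inv_cdf(conf)  # about 1.645 for 95%
    return mean + z * sd / sqrt(n)


years_bound = upper_bound(12.260, 5.086, 50)
mu_0 = 13  # mean under the null hypothesis

# For a lower-tailed test, H0 is rejected only when mu_0 exceeds the upper bound
reject_h0 = mu_0 > years_bound
print(f"95% upper bound = {years_bound:.3f}, reject H0: {reject_h0}")
```

Because 13 lies below the upper bound, the bound leads to the same do-not-reject decision as the z statistic and the p-value.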

d. The average (mean) credit balance for suburban customers
is less than $4300
Descriptive Statistics: Credit Balance($)
Variable           Mean   StDev  Minimum  Maximum
Credit Balance($)  3970  933.49     1864     5678

One-Sample Z: Credit Balance($)
Test of mu = 4300 vs > 4300
The assumed standard deviation = 933.49
                                              95% Lower
Variable            N  Mean   StDev  SE Mean      Bound      Z      P
Credit Balance($)  50  3970  933.49      132       3754  -2.50  0.994
