
Exercise sheet 1


Exercise Sheet 1: Sketched solutions

1.

i) Define the simple linear correlation coefficient, $r$, between the observed values $X_i$ and $Y_i$ (for $i = 1, 2, \ldots, n$) of two variables $X$ and $Y$.

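$$r = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)}\,\sqrt{\operatorname{Var}(Y)}}$$

$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})/n}{\sqrt{\sum (X_i - \bar{X})^2/n}\,\sqrt{\sum (Y_i - \bar{Y})^2/n}} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}}$$

ii) Suppose an ordinary least squares linear regression is undertaken between Y and X: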

$Y = a_1 + b_1 X$.

Show that the coefficient of determination of this regression, $R^2$, is equal to $r^2$ defined in (i). Hence prove that

-1 ≤ r ≤ 1.

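$$R^2 = \frac{\hat{b}_1 \sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (Y_i - \bar{Y})^2}$$

Substituting in

$$\hat{b}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$

gives

$$R^2 = \frac{\left[\sum (X_i - \bar{X})(Y_i - \bar{Y})\right]^2}{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2} = r^2$$

Since $R^2 \le 1$, $r^2 \le 1$, so $-1 \le r \le 1$.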
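As a numerical check of the identity $R^2 = r^2$, the sketch below uses the lectures and drinking data from question 2 of this sheet. It is a minimal illustration rather than part of the original solutions, and the variable names are assumptions chosen for readability. Here $R^2$ is computed independently as $1 - RSS/TSS$ so that the comparison with $r^2$ is not circular.

```python
import math

# Data from question 2 of this sheet: X = hours drinking, Y = hours of lectures.
x = [3, 15, 8, 9, 12, 12, 4]
y = [3.6, 2.2, 3.1, 3.5, 2.7, 2.6, 3.9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross products about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Correlation coefficient as defined in (i).
r = s_xy / math.sqrt(s_xx * s_yy)

# OLS fit of Y on X, then R^2 = 1 - RSS / TSS.
b1 = s_xy / s_xx
a1 = y_bar - b1 * x_bar
rss = sum((yi - a1 - b1 * xi) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - rss / s_yy

print("r =", r)
print("r^2 =", r ** 2)
print("R^2 =", r_squared)
```

The two printed values of $r^2$ and $R^2$ should agree up to floating-point rounding.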

iii) Now suppose an ordinary least squares linear regression is undertaken between X and Y :

$X = a_2 + b_2 Y$.

What is the relationship between $\hat{b}_2$ in this regression and $\hat{b}_1$ estimated in (ii)?

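$$\hat{b}_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (Y_i - \bar{Y})^2}$$

Now

$$\hat{b}_2 \hat{b}_1 = \frac{\left[\sum (X_i - \bar{X})(Y_i - \bar{Y})\right]^2}{\sum (Y_i - \bar{Y})^2 \sum (X_i - \bar{X})^2} = R^2$$

$$\hat{b}_2 = \frac{R^2}{\hat{b}_1}$$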

Can we deduce the estimate of $\hat{a}_2$ from the regression statistics in (ii)?

$$\hat{a}_2 = \bar{X} - \hat{b}_2 \bar{Y}$$

Yes: all of these statistics are calculated for the first regression.
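The relationship $\hat{b}_2 \hat{b}_1 = R^2$ can be illustrated numerically. The sketch below again borrows the data from question 2 of this sheet; it is an illustration under that assumption, and the variable names are not from the original.

```python
# Data from question 2 of this sheet: X = hours drinking, Y = hours of lectures.
x = [3, 15, 8, 9, 12, 12, 4]
y = [3.6, 2.2, 3.1, 3.5, 2.7, 2.6, 3.9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

b1 = s_xy / s_xx                       # slope of Y on X, as in (ii)
b2 = s_xy / s_yy                       # slope of X on Y, as in (iii)
r_squared = s_xy ** 2 / (s_xx * s_yy)  # R^2 = r^2 from (ii)

# Intercept of the reverse regression, using only first-regression statistics.
a2 = x_bar - (r_squared / b1) * y_bar

print("b1 * b2 =", b1 * b2)
print("R^2     =", r_squared)
print("b2      =", b2)
print("1 / b1  =", 1 / b1)
print("a2      =", a2)
```

Because $R^2 < 1$ for these data, `b2` and `1 / b1` differ; they would coincide only if every observation lay exactly on a straight line.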

iv) In general, why is $\hat{b}_2 \neq 1/\hat{b}_1$? When will $\hat{b}_2 = 1/\hat{b}_1$? Illustrate your answer.

Since $\hat{b}_2 = R^2/\hat{b}_1$, we have $\hat{b}_2 = 1/\hat{b}_1$ only when $R^2 = 1$.

2. At Birmingham University 7 economics undergraduates were randomly selected from the population and surveyed. They were asked:

i) What was the average number of hours of lectures per week you attended last year?

ii) What was the average number of hours per week you spent drinking last year?

The following data was collected:

Student   Hours of lectures   Hours of drinking
1         3.6                 3
2         2.2                 15
3         3.1                 8
4         3.5                 9
5         2.7                 12
6         2.6                 12
7         3.9                 4

a) Estimate the ordinary least squares equation Y = α + βX, where Y is the average number of hours of lectures and X represents the average number of hours drinking per week.

       Y      X     X*X    X*Y     Y*Y
       3.6    3     9      10.8    12.96
       2.2    15    225    33      4.84
       3.1    8     64     24.8    9.61
       3.5    9     81     31.5    12.25
       2.7    12    144    32.4    7.29
       2.6    12    144    31.2    6.76
       3.9    4     16     15.6    15.21
Sums   21.6   63    683    179.3   68.92
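The OLS estimates follow directly from the column sums in the table above. The short sketch below shows the arithmetic; the variable names are assumptions made for this illustration.

```python
# Column sums taken from the calculation table in part a).
n = 7
sum_y, sum_x = 21.6, 63
sum_xx, sum_xy = 683, 179.3

# Sums of squares and cross products about the means.
s_xy = sum_xy - sum_x * sum_y / n   # 179.3 - 194.4
s_xx = sum_xx - sum_x ** 2 / n      # 683 - 567

beta_hat = s_xy / s_xx
alpha_hat = sum_y / n - beta_hat * sum_x / n

print("beta_hat  =", round(beta_hat, 5))
print("alpha_hat =", round(alpha_hat, 6))
```

These values can be compared with the Intercept and X Variable 1 coefficients in the summary output for this question.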

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.930833
R Square            0.86645
Adjusted R Square   0.83974
Standard Error      0.246158
Observations        7

ANOVA
             df   SS         MS         F          Significance F
Regression   1    1.965603   1.965603   32.43913   0.002328
Residual     5    0.302968   0.060594
Total        6    2.268571

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      4.257266       0.225759         18.85754   7.72E-06   3.676933    4.837599    3.676933      4.83759
X Variable 1   -0.13017       0.022855         -5.69554   0.002328   -0.18892    -0.07142    -0.18892      -0.0714

b) Calculate R-squared and the standard error of the equation.

c) Test the null hypothesis H0: β = 0.
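Parts b) and c) can be worked from the same column sums. The sketch below is illustrative only: the question does not state a significance level, so a two-sided 5% test is assumed, and the critical value 2.571 for a t distribution with 5 degrees of freedom is hard-coded from standard tables rather than computed.

```python
import math

# Column sums from the calculation table in part a).
n = 7
sum_y, sum_x = 21.6, 63
sum_xx, sum_xy, sum_yy = 683, 179.3, 68.92

s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n
tss = sum_yy - sum_y ** 2 / n        # total sum of squares

beta_hat = s_xy / s_xx
ess = beta_hat * s_xy                # explained sum of squares
rss = tss - ess                      # residual sum of squares

# b) R-squared and the standard error of the equation.
r_squared = ess / tss
sigma_hat = math.sqrt(rss / (n - 2))

# c) t test of H0: beta = 0.
se_beta = sigma_hat / math.sqrt(s_xx)
t_stat = beta_hat / se_beta
t_crit = 2.571                       # assumed: two-sided 5% level, 5 d.f. (tables)
reject_h0 = abs(t_stat) > t_crit

print("R^2 =", round(r_squared, 5))
print("standard error =", round(sigma_hat, 6))
print("t =", round(t_stat, 5), "reject H0:", reject_h0)
```

The results can be checked against the R Square, Standard Error and t Stat entries in the summary output.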

3. The following are data on: Y = quit rate per 100 employees in manufacturing

X = unemployment rate

Year Y X Year Y X

2000 1.3 6.2 2007 2.3 3.6

2001 1.2 7.9 2008 2.5 3.3

2002 1.4 5.9 2009 2.7 3.3

2003 1.4 5.7 2010 2.1 5.6

2004 1.5 5.0 2011 1.9 6.9

2005 1.9 4.0 2012 2.2 5.6

2006 2.6 3.2

a) Calculate a regression of Y on X.

Y = α + βX + ε

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.789141
R Square            0.622743
Adjusted R Square   0.588447
Standard Error      0.335554
Observations        13

ANOVA
             df   SS         MS         F          Significance F
Regression   1    2.044514   2.044514   18.15786   0.00134
Residual     11   1.238563   0.112597
Total        12   3.283077

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      3.315391       0.339737         9.758685   9.43E-07   2.567634    4.063148    2.567634      4.06314
X Variable 1   -0.27342       0.064164         -4.2612    0.00134    -0.41464    -0.13219    -0.41464      -0.1321
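The figures in the summary output can be reproduced from the raw data with the standard OLS formulas. The sketch below is one way to do so in plain Python; the variable names are assumptions for this illustration.

```python
import math

# Data from question 3: Y = quit rate, X = unemployment rate, 2000 to 2012.
y = [1.3, 1.2, 1.4, 1.4, 1.5, 1.9, 2.6, 2.3, 2.5, 2.7, 2.1, 1.9, 2.2]
x = [6.2, 7.9, 5.9, 5.7, 5.0, 4.0, 3.2, 3.6, 3.3, 3.3, 5.6, 6.9, 5.6]
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
tss = sum((yi - y_bar) ** 2 for yi in y)

beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Residual sum of squares, R-squared and standard error of the equation.
rss = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - rss / tss
sigma_hat = math.sqrt(rss / (n - 2))

print("alpha_hat =", round(alpha_hat, 6))
print("beta_hat  =", round(beta_hat, 5))
print("R^2       =", round(r_squared, 6))
print("sigma_hat =", round(sigma_hat, 6))
```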

b) Construct a 95% confidence interval for α.

c) Test the null hypothesis H0: β = 0 at the 5% significance level.

d) Construct a 90% confidence interval for σ2.

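$$\frac{(n-2)\hat{\sigma}^2}{\chi^2_{(n-2),\,0.95}} < \sigma^2 < \frac{(n-2)\hat{\sigma}^2}{\chi^2_{(n-2),\,0.05}}$$

$n = 13$, $\hat{\sigma} = 0.335554$

$\chi^2_{11,\,0.05} = 4.575$, $\chi^2_{11,\,0.95} = 19.675$

$$\frac{11 \times 0.335554^2}{19.675} < \sigma^2 < \frac{11 \times 0.335554^2}{4.575}$$

$$0.06295 < \sigma^2 < 0.271$$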
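The intervals and test in parts b) to d) can be checked with a few lines of arithmetic. In the sketch below the coefficient estimates, standard errors and chi-squared points are copied from this sheet; the t critical value 2.201 (two-sided 5% level, 11 degrees of freedom) is an assumption taken from standard tables, since the sheet does not quote it.

```python
# Values copied from the summary output for question 3.
n = 13
alpha_hat, se_alpha = 3.315391, 0.339737
beta_hat, se_beta = -0.27342, 0.064164
sigma_hat = 0.335554

# Assumed table value: t with 11 d.f., 2.5% in each tail.
t_crit = 2.201

# b) 95% confidence interval for alpha.
alpha_lower = alpha_hat - t_crit * se_alpha
alpha_upper = alpha_hat + t_crit * se_alpha

# c) t test of H0: beta = 0 at the 5% level.
t_stat = beta_hat / se_beta
reject_h0 = abs(t_stat) > t_crit

# d) 90% confidence interval for sigma^2, chi-squared points as quoted in the sheet.
chi2_05, chi2_95 = 4.575, 19.675
rss = (n - 2) * sigma_hat ** 2
var_lower = rss / chi2_95
var_upper = rss / chi2_05

print("95% CI for alpha:", round(alpha_lower, 4), round(alpha_upper, 4))
print("t =", round(t_stat, 4), "reject H0:", reject_h0)
print("90% CI for sigma^2:", round(var_lower, 5), round(var_upper, 3))
```

The interval for α should agree with the Lower 95% and Upper 95% columns of the summary output to about four decimal places; small differences come from rounding the critical value.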

e) Calculate R-squared.

f) Are there any reasons why the assumptions of the classical normal linear model are invalid? Consider

$\text{Quits}_t = \alpha + \beta\,\text{Unempl}_t + \varepsilon_t$

$E(\text{Unempl}_t\,\varepsilon_t) \neq 0$
