We are given the following coordinate pairs: (17,36), (27,25), (37,20), (47,12), (57,10), (67,7), and (77,5), where the x-coordinate represents the age in years and the y-coordinate the percent of fatal accidents with speeding as the major cause.
It might be easiest to create a table with the following headings: x, y, xy, x^2, and y^2. Then sum each of the columns:
  x     y     xy     x^2    y^2
 17    36    612     289   1296
 27    25    675     729    625
 37    20    740    1369    400
 47    12    564    2209    144
 57    10    570    3249    100
 67     7    469    4489     49
 77     5    385    5929     25
-------------------------------
329   115   4015   18263   2639
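The column sums in the table can be double-checked with a few lines of Python. This is only a minimal sketch. The variable names (`xs`, `ys`, `sum_xy`, and so on) are chosen here for illustration and are not part of the original problem.

```python
# Data from the problem: x is age in years,
# y is the percent of fatal accidents with speeding as the major cause.
xs = [17, 27, 37, 47, 57, 67, 77]
ys = [36, 25, 20, 12, 10, 7, 5]

# One sum per column of the table.
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 329 115 4015 18263 2639
```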
(1) Find the mean of x and y:
bar(x)=(sum x)/n=329/7=47
bar(y)=(sum y)/n=115/7 ~~ 16.43
(2) Find a and b for the linear regression line of best fit. (Note that you should first check that the correlation is significant. With r ~~ -0.959 and n=7, the correlation is significant at the 98% confidence level or better.)
a=((sum y)(sum x^2)-(sum x)(sum xy))/(n(sum x^2)-(sum x)^2)
=(115*18263-329*4015)/(7*18263-329^2) ~~ 39.761
b=(n(sum xy)-(sum x)(sum y))/(n(sum x^2)-(sum x)^2)
=(7*4015-329*115)/(7*18263-329^2) ~~ -0.496
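As a check on the arithmetic, the two formulas above can be evaluated directly from the column sums. The sketch below also computes r with the standard Pearson formula. That formula is not written out in this answer, so treating it as the source of the quoted r value is an assumption. All variable names are illustrative.

```python
import math

# Column sums taken from the table.
n = 7
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = 329, 115, 4015, 18263, 2639

# Denominator shared by the formulas for a and b.
denom = n * sum_x2 - sum_x ** 2  # 19600

a = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # intercept
b = (n * sum_xy - sum_x * sum_y) / denom       # slope

# Pearson correlation coefficient (standard formula, assumed here).
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(denom * (n * sum_y2 - sum_y ** 2))

print(round(a, 3), round(b, 3), round(r, 3))  # 39.761 -0.496 -0.959
```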
(3) The equation of the regression line y'=a+bx is y'=39.761-0.496x
(4) The value of the regression line at x=25 (which is the estimate for the percentage of fatal accidents caused by speeding for 25-year-olds) is
y'=39.761-0.496(25) ~~ 27.35
(This uses the unrounded values of a and b; the rounded coefficients shown give 27.36.)
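The estimate can be reproduced without intermediate rounding by keeping a and b at full precision. This is a minimal sketch. The helper name `predict` is hypothetical and was introduced only for this example.

```python
# Column sums taken from the table.
n, sum_x, sum_y, sum_xy, sum_x2 = 7, 329, 115, 4015, 18263

denom = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # intercept, about 39.761
b = (n * sum_xy - sum_x * sum_y) / denom       # slope, about -0.496


def predict(age):
    """Estimated percent of fatal accidents caused by speeding at a given age."""
    return a + b * age


print(round(predict(25), 2))  # 27.35
```

Because the seven data points span ages 17 to 77, the line is best used for ages inside that range.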
http://mathworld.wolfram.com/LeastSquaresFitting.html