Phyton

Saturday, September 19, 2009

THE SIMPLE LINEAR REGRESSION MODEL WITH EXAMPLE

The relationship between a response variable Y and a predictor variable X is postulated as a linear model

Y = β₀ + β₁X + ε    (1)

where β₀ and β₁ are constants called the model regression coefficients or parameters, and ε is a random disturbance or error.

The coefficient β₁, called the slope, may be interpreted as the change in Y for a unit change in X. The coefficient β₀, called the constant coefficient or intercept, is the predicted value of Y when X = 0.

Equation (1), written for each observation in the data, becomes

yᵢ = β₀ + β₁xᵢ + εᵢ,  i = 1, …, n    (2)

where yᵢ represents the ith value of the response variable Y, xᵢ represents the ith value of the predictor variable X, and εᵢ represents the error in the approximation.
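To make the model concrete, here is a short Python sketch that simulates observations from the data form of the model. The parameter values and the noise scale are hypothetical, chosen only for illustration; they are not from the study discussed below.

```python
import random

random.seed(0)  # reproducible noise

b0_true, b1_true = 3.0, 0.5   # assumed true intercept and slope (hypothetical)
x = [8, 12, 16, 20, 24]       # predictor values

# y_i = b0 + b1*x_i + e_i, with e_i drawn as Gaussian noise
y = [b0_true + b1_true * xi + random.gauss(0, 1) for xi in x]
```

In practice we never observe β₀, β₁, or the errors εᵢ; we only observe the (xᵢ, yᵢ) pairs and must estimate the parameters from them.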

PARAMETER ESTIMATION

Based on the available data, we wish to estimate the parameters β₀ and β₁. This is equivalent to finding the straight line that gives the best fit. We estimate the parameters using the popular least squares method, which gives the line that minimizes the sum of squares of the vertical distances from each point to the line. The vertical distances represent the errors in the response variable. These errors can be obtained by rewriting (2) as

εᵢ = yᵢ − β₀ − β₁xᵢ    (3)

The sum of squares of these distances can then be written as

S(β₀, β₁) = Σ εᵢ² = Σ (yᵢ − β₀ − β₁xᵢ)²    (4)

The values of β̂₀ and β̂₁ that minimize S(β₀, β₁) are given by

β̂₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²    (5)

and

β̂₀ = ȳ − β̂₁x̄    (6)

Note that we give the formula for β̂₁ before the formula for β̂₀ because β̂₀ uses β̂₁.

The estimates β̂₀ and β̂₁ are called the least squares estimates of β₀ and β₁ because they are the solution to the least squares method: the intercept and the slope of the line that has the smallest possible sum of squares of the vertical distances from each point to the line. For this reason, the line is called the least squares regression line.
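The two estimation formulas above translate directly into Python. This is a minimal sketch; the function name and the toy data are my own, not from the text:

```python
def least_squares(x, y):
    """Return the least squares intercept and slope for paired data."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # slope: sum((xi - xbar)(yi - ybar)) / sum((xi - xbar)^2)
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    # intercept uses the slope estimate, hence it is computed second
    b0 = ybar - b1 * xbar
    return b0, b1

print(least_squares([1, 2, 3], [2, 4, 6]))  # → (0.0, 2.0)
```

The order of the two assignments mirrors the note above: the intercept formula needs the slope estimate, so the slope is computed first.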

The least squares regression line is given by

ŷ = β̂₀ + β̂₁x    (7)

For each observation in our data we can compute

ŷᵢ = β̂₀ + β̂₁xᵢ    (8)

These are called the fitted values. Thus, the ith fitted value, ŷᵢ, is the point on the least squares regression line (7) corresponding to xᵢ.

The vertical distance corresponding to the ith observation is

eᵢ = yᵢ − ŷᵢ    (9)

These vertical distances are called the ordinary least squares residuals.
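Fitted values and residuals follow the same pattern in code. A small sketch, with a hypothetical function name and toy inputs:

```python
def fitted_and_residuals(x, y, b0, b1):
    """Fitted values from the line and the ordinary least squares residuals."""
    fitted = [b0 + b1 * xi for xi in x]                 # y-hat_i = b0 + b1*x_i
    residuals = [yi - fi for yi, fi in zip(y, fitted)]  # e_i = y_i - y-hat_i
    return fitted, residuals

f, r = fitted_and_residuals([1, 2, 3], [2, 5, 6], 0.0, 2.0)
print(f)  # → [2.0, 4.0, 6.0]
print(r)  # → [0.0, 1.0, 0.0]
```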


EXAMPLE:

A study was conducted to determine the effects of sleep deprivation on students' ability to solve problems. The amount of sleep deprivation varied over 8, 12, 16, 20, and 24 hours without sleep. A total of ten subjects participated in the study, two at each deprivation level. After a specified sleep deprivation period, each subject was administered a set of simple addition problems, and the number of errors was recorded. The following results were obtained:

Number of errors (y)    Number of hours without sleep (x)
8                       8
6                       8
6                       12
10                      12
8                       16
14                      16
14                      20
12                      20
16                      24
12                      24

1. Find the least squares regression model
2. Calculate and interpret the result
3. Does the data present sufficient evidence to conclude that there is a relationship between the number of errors and the number of hours without sleep?
4. Find the observed significance level for the test and interpret its value
5. Find the coefficient of determination and interpret its value
6. Find the coefficient of correlation and interpret its value
7. Find the significance level for the regression model and interpret its value.
8. Determine the 95% confidence interval and interpret its value


Solution

Computing the OLS (ordinary least squares) regression line:

We want an equation of the form

ŷ = β̂₀ + β̂₁x

To find the least squares regression line we need to find the intercept and the slope.

The slope of the line, β̂₁, is computed by the formula

β̂₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²

The intercept of the line, β̂₀, is computed by the formula

β̂₀ = ȳ − β̂₁x̄
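These two formulas can be applied to the study data directly in Python; a quick sketch to check the hand computation that follows:

```python
x = [8, 8, 12, 12, 16, 16, 20, 20, 24, 24]  # hours without sleep
y = [8, 6, 6, 10, 8, 14, 14, 12, 16, 12]    # number of errors

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n         # 16 and 10.6

ssxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
ssxx = sum((xi - xbar) ** 2 for xi in x)

b1 = ssxy / ssxx       # slope
b0 = ybar - b1 * xbar  # intercept
```

This gives β̂₁ = 0.475 and β̂₀ = 3, matching the tabulated quantities.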


Table 1

X    Y    X−X̄   (X−X̄)²   Y−Ȳ    (X−X̄)(Y−Ȳ)   Ŷ (predicted)   (Y−Ȳ)²   (Y−Ŷ)²
8    8    −8     64        −2.6    20.8           6.8             6.76     1.44
8    6    −8     64        −4.6    36.8           6.8             21.16    0.64
12   6    −4     16        −4.6    18.4           8.7             21.16    7.29
12   10   −4     16        −0.6    2.4            8.7             0.36     1.69
16   8    0      0         −2.6    0              10.6            6.76     6.76
16   14   0      0         3.4     0              10.6            11.56    11.56
20   14   4      16        3.4     13.6           12.5            11.56    2.25
20   12   4      16        1.4     5.6            12.5            1.96     0.25
24   16   8      64        5.4     43.2           14.4            29.16    2.56
24   12   8      64        1.4     11.2           14.4            1.96     5.76

Sums: ΣX = 160, ΣY = 106, Σ(X−X̄)² = 320, Σ(X−X̄)(Y−Ȳ) = 152
Means: X̄ = 16, Ȳ = 10.6
TSS = Σ(Y−Ȳ)² = 112.4, SSE = Σ(Y−Ŷ)² = 40.2

Using the quantities in Table 1, we have

β̂₁ = 152/320 = 0.475

and

β̂₀ = ȳ − β̂₁x̄ = 10.6 − 0.475(16) = 3

Then the equation of the least squares regression line is

ŷ = 3 + 0.475x
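With the fitted line in hand, the TSS and SSE totals from Table 1, and from them the coefficient of determination asked for in question 5, can be verified with a short Python sketch:

```python
x = [8, 8, 12, 12, 16, 16, 20, 20, 24, 24]  # hours without sleep
y = [8, 6, 6, 10, 8, 14, 14, 12, 16, 12]    # number of errors

b0, b1 = 3.0, 0.475                  # least squares estimates
pred = [b0 + b1 * xi for xi in x]    # fitted values from y-hat = 3 + 0.475x

ybar = sum(y) / len(y)
tss = sum((yi - ybar) ** 2 for yi in y)               # total sum of squares
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))  # residual sum of squares
r2 = 1 - sse / tss                   # coefficient of determination
```

This reproduces TSS = 112.4 and SSE = 40.2 from the table, and gives r² ≈ 0.64: roughly 64% of the variation in the number of errors is explained by hours without sleep.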