<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>

<head>
<meta http-equiv="Content-Type" content="text/html; charset=unicode">
<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<title>Multiple Regression &amp; Correlation Example, Dr. Usip, Economics</title>
</head>

<body>

<h1 align="center"><font color="#FF0000"><b><a name="Multiple Regression">Multiple
Regression</a> <a name="top">&amp;</a> Correlation Example</b></font></h1>

<p><font color="#800040" size="5"><strong>Motivation:</strong> </font><font size="3"
color="#000000">Oftentimes, it may not be realistic to conclude that only one factor or
IV&nbsp; influences the behavior of the DV. In such situations, a researcher needs to
carefully identify those other possible factors and explicitly include them in the Linear
Regression Model (LRM). Existing economic theory or common sense should constitute the basis
for selecting the IVs, and where data on a theoretically construed variable are not readily
available, a proxy should be carefully chosen.&nbsp; </font></p>

<p><font size="3" color="#000000">This tutorial will illustrate the key steps involved in
using multiple regression and correlation to solve real world problems. The example will
consider a multiple LRM which typically has the form:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</font><font color="#800040" size="4">Y<sub>i</sub> =<blink> </blink>A +<blink> </blink>B<sub>1</sub>X<sub>i,1</sub><blink>
</blink>+ B<sub>2</sub>X<sub>i,2</sub><blink> </blink>+ ... + B<sub>K</sub>X<sub>i,K</sub> +
E<sub>i</sub><blink>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<br>
</blink></font><font color="#000000">where the X<sub>j</sub>'s are the IVs; A and
B<sub>j</sub> (j = 1, 2, ..., K) are the regression parameters or coefficients; each B<sub>j</sub> reflects
the partial effect of the associated IV, holding the effects of all other IVs constant; K
is the number of IVs in the model; and E<sub>i</sub> is the random error term.&nbsp;
Again, </font><font size="3">note that in <strong><a
href="glossary.htm#Regression Analysis">regression analysis</a></strong>, all of the
underlying <a href="glossary.htm#Classical Assumptions"><strong>classical assumptions</strong></a>
essentially apply to this random error term.</font><font color="#000000"> In multiple
regression the three most crucial ones are&nbsp; the assumptions of no <a
href="glossary.htm#Multicollinearity"><strong>multicollinearity</strong></a> among the
IVs, of no <a href="glossary.htm#Heteroskedasticity"><strong>heteroskedasticity</strong></a>
in the error variances, and of no <a href="glossary.htm#Autocorrelation"><strong>autocorrelation</strong></a>
in the errors for all i. <br>
<br>
</font><font size="5" color="#800040"><strong>Step 1: Formulate the LRM and State the
Expected Signs<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; of the Regression
Parameters</strong></font><font color="#000000"><br>
When specifying an LRM, theory or common sense should be your guide in stipulating, a
priori, the expected signs of the regression parameters.<br>
<br>
Let us return to the family food expenditure example that we introduced in the <a
href="simple_reg.htm#Simple Regression">simple regression</a> tutorial.&nbsp; In that
tutorial,&nbsp; the only factor that was explicitly identified as the predictor of annual
family <strong>Food Expenditure</strong> (Y) was <strong>Income</strong> (henceforth
denoted as X<sub>1</sub>). &nbsp; The effects of all other predictors were assumed away or
held constant. &nbsp; We now extend the model to include another important determinant,
viz., the family size (X<sub>2</sub>) which is easily measured in terms of the number of
people in a family.&nbsp; The representative LRM has the form: <br>
<br>
</font><font size="3" color="#000000">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</font><font color="#800040" size="4">Y<sub>i</sub><blink> = </blink>A + B<sub>1</sub>X<sub>i,1</sub><blink>
</blink>+ B<sub>2</sub>X<sub>i,2</sub> + E<sub>i</sub> <blink>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<br>
</blink></font><font color="#000000">Based on economic theory, we should expect the signs
of </font><font color="#800040" size="4">B<sub>1</sub></font><font color="#000000"> and </font><font
color="#800040" size="4">B<sub>2</sub></font><font color="#000000"> to be positive; in
other words, both the family <strong>Income</strong> and <strong>Size</strong>,
respectively, are expected to have positive effects on the family food expenditure. Note
that </font><font color="#800040" size="4">B<sub>1</sub></font><font color="#000000">
measures the partial effect of Income on family food expenditure, holding family size
constant; whereas </font><font color="#800040" size="4">B<sub>2</sub></font><font
color="#000000"> measures the partial effect of family Size on Food Expenditure, holding
income constant.&nbsp; Also, note that holding one IV constant while examining the effect
of the other requires that the two IVs not be perfectly collinear.&nbsp; The sign of </font><font
color="#800040" size="4">A</font><font color="#000000"> could be positive or negative and
indeed </font><font color="#800040" size="4">A</font><font color="#000000"> may or may not
have an interpretable meaning. Nonetheless, always include the intercept term in your
model -- more on this in Econs 853 and 976. <br>
<br>
</font><font size="5" color="#800040"><strong>Step 2: Examine the Data Visually for
Inherent Patterns<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; with a Scatterplot
Matrix.</strong></font><font color="#000000"><br>
It is always advisable to do some exploratory analysis of the data to uncover inherent
patterns as to the type and strength of relationship among the variables as well as the
presence of outliers in the data. The&nbsp; scatterplot matrix is a useful graphical
device for doing so. While a strong linear association between the DV and each of the IVs
is highly desirable, a strong linear association between (or among) the IVs is highly
undesirable since it is indicative of a <strong>collinearity (or
multicollinearity)</strong> problem in the model. The consequences of
collinearity/multicollinearity will be treated in Econs 853 and 976.</font></p>

<p><font color="#000000">For this example, the </font><a
href="scattergram.htm#Scattergram"><font size="3">data set</font></a><font color="#000000">
for the simple regression analysis has been augmented to include data on <strong>X<sub>2</sub></strong>.
&nbsp; The results of the preliminary analysis of the data are discussed separately in the
<a href="scattermatrix.htm#Scatterplot Matrix">scatterplot matrix</a> component. <br>
<br>
After studying the results for reasonable inferences, the next phase of the data analysis
is to estimate the LRM. Estimating the embedded parameters of the population regression
plane (PRP) is accomplished by fitting the sample regression plane (SRP) to a sample of
data on all the variables of the model.&nbsp;&nbsp;<br>
<br>
</font><font color="#800040" size="5">Step 3: Estimate the SRP<br>
</font><font size="3">Again, the estimation method is the classical <a
href="glossary.htm#OLS"><strong>Ordinary Least Squares (OLS)</strong></a> technique which
is applied to the sample regression plane (SRP) that has the form:<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</font><font color="#0000FF">y<sub>i</sub><blink> </blink>=<blink> </blink>a<blink> </blink>+<blink>
</blink>b<sub>1</sub>X<sub>i,1</sub><blink> </blink>+<blink> </blink>b<sub>2</sub>X<sub>i,2</sub>
+ e<sub>i</sub></font> &nbsp;&nbsp;&nbsp; or&nbsp;&nbsp; <font color="#0000FF"><strong>ý<sub>i</sub></strong></font>
<font color="#0000FF">=</font>  <font color="#0000FF">a<blink> </blink>+<blink> </blink>b<sub>1</sub>X<sub>i,1</sub><blink>
</blink>+<blink> </blink>b<sub>2</sub>X<sub>i,2</sub></font><br>
Note that <font size="4" color="#0000FF">y<sub>i</sub></font> and <strong><font size="4"
color="#0000FF">ý</font><font color="#0000FF" size="1"><sub>i</sub></font></strong> are
the actual/observed and the predicted/estimated values of Y, respectively (for all i = 1,
2, ..., n).&nbsp; '<font color="#0000FF"><strong>a</strong></font>' and '<font
color="#0000FF"><strong>b<sub>1</sub></strong></font>' , and '<font color="#0000FF"><strong>b<sub>2</sub></strong></font>'
are the estimators of <font color="#0000FF"><strong>A</strong></font>, <font
color="#0000FF"><strong>B<sub>1</sub></strong></font>, and <font color="#0000FF"><strong>B<sub>2</sub></strong></font>,
respectively. '<font color="#0000FF"><strong>e</strong></font>' denotes the residual
(defined as <font color="#0000FF"><strong>e</strong></font> = <font size="4"
color="#0000FF">y<sub>i</sub></font> - <strong><font size="4" color="#0000FF">ý</font><font
color="#0000FF" size="1"><sub>i</sub></font></strong>) and is the estimator of the random
error term <font color="#0000FF"><strong>E</strong></font>. <br>
<font size="3"><br>
The OLS method is programmed into the SPSS/win statistical package.&nbsp; Using the
command sequence presented earlier will automatically implement this method.&nbsp; The
following outputs contain the necessary results, which are based on selected options that
are accessible via the 'Statistics...' button.</font></p>
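<p>For readers who want to see what the OLS computation involves, the following is a minimal Python sketch (not the SPSS procedure itself) that solves the least-squares normal equations for a model with two IVs. The function name <code>ols_two_regressors</code> and the small noise-free data set are made up for illustration; they are not the food expenditure data.</p>
<pre>
```python
def ols_two_regressors(y, x1, x2):
    """Return (a, b1, b2) that minimize the sum of squared residuals."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Sums of squares and cross-products in deviation-from-mean form.
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    # The determinant is zero when X1 and X2 are perfectly collinear.
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("X1 and X2 are perfectly collinear")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2

# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]

a, b1, b2 = ols_two_regressors(y, x1, x2)
print(a, b1, b2)
```
</pre>
<p>Because the illustrative data contain no error term, the sketch should recover the generating coefficients (1, 2 and 3) up to floating-point rounding. The check on the determinant shows why perfect collinearity between the IVs makes estimation impossible.</p>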
<div align="center"><center>

<table BORDER="1" CELLPADDING="5">
  <caption><b>Descriptive Statistics</b> </caption>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000"><br>
    </font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Mean</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Std. Deviation</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">N</font></th>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Food Expenditure ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">7.965</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">4.664</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">20</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Income ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">45.50</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">23.96</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">20</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Family Size</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">2.95</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">1.61</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">20</font> </td>
  </tr>
</table>
</center></div>

<p>&nbsp;</p>

<p>&nbsp;</p>
<div align="center"><center>

<table BORDER="1" CELLPADDING="5">
  <caption><b><br>
  Correlations</b> </caption>
  <tr>
    <th COLSPAN="2" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000"><br>
    </font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Annual Food Expenditure ($000)</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Annual Income ($000)</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Family Size</font></th>
  </tr>
  <tr>
    <th ROWSPAN="3" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Pearson Correlation</font></th>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Food Expenditure ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">1.000</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.946</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.787</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Income ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.946</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">1.000</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.676</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Family Size</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.787</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.676</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">1.000</font> </td>
  </tr>
  <tr>
    <th ROWSPAN="3" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Sig. (1-tailed)</font></th>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Food Expenditure ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.000</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.000</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Income ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.000</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.001</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Family Size</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.000</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.001</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.</font> </td>
  </tr>
  <tr>
    <th ROWSPAN="3" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">N</font></th>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Food Expenditure ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Income ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Family Size</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">20</font> </td>
  </tr>
</table>
</center></div><div align="center"><center>

<table BORDER="1" CELLPADDING="5">
  <caption><b><br>
  Model Summary(b)</b> </caption>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Model</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">R</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">R Square</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Adjusted R Square</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Std. Error of the Estimate</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Durbin-Watson</font></th>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">1</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">.967(a)</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.935</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.927</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">1.261</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">2.616</font> </td>
  </tr>
  <tr>
    <td COLSPAN="6" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">a Predictors:
    (Constant), Family Size , Annual Income ($000)</font></td>
  </tr>
  <tr>
    <td COLSPAN="6" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">b Dependent Variable:
    Annual Food Expenditure ($000)</font></td>
  </tr>
</table>
</center></div><div align="center"><center>

<table BORDER="1" CELLPADDING="5">
  <caption><b><br>
  ANOVA(b)</b> </caption>
  <tr>
    <th COLSPAN="2" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Model</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Sum of Squares</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">df</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Mean Square</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">F</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Sig.</font></th>
  </tr>
  <tr>
    <th ROWSPAN="3" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">1</font></th>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Regression</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">386.313</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">2</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">193.156</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">121.470</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">.000(a)</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Residual</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">27.033</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">17</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">1.590</font> </td>
    <td BGCOLOR="#FFFFFF"><font COLOR="#000000"><br>
    </font></td>
    <td BGCOLOR="#FFFFFF"><font COLOR="#000000"><br>
    </font></td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Total</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">413.346</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">19</font> </td>
    <td BGCOLOR="#FFFFFF"><font COLOR="#000000"><br>
    </font></td>
    <td BGCOLOR="#FFFFFF"><font COLOR="#000000"><br>
    </font></td>
    <td BGCOLOR="#FFFFFF"><font COLOR="#000000"><br>
    </font></td>
  </tr>
  <tr>
    <td COLSPAN="7" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">a Predictors:
    (Constant), Family Size , Annual Income ($000)</font></td>
  </tr>
  <tr>
    <td COLSPAN="7" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">b Dependent Variable:
    Annual Food Expenditure ($000)</font></td>
  </tr>
</table>
</center></div>
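<p>The entries of the 'Model Summary' and 'ANOVA' tables above are linked by simple arithmetic. The following Python sketch recomputes F, R, R-square, adjusted R-square and the standard error of the estimate from the sums of squares in the ANOVA table; the variable names are chosen here for illustration only.</p>
<pre>
```python
import math

# Figures copied from the ANOVA table above.
ss_regression = 386.313
ss_residual = 27.033
ss_total = 413.346
n = 20  # sample size
k = 2   # number of IVs

df_regression = k
df_residual = n - k - 1
df_total = n - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual
r_square = ss_regression / ss_total
adj_r_square = 1 - (ss_residual / df_residual) / (ss_total / df_total)
std_error = math.sqrt(ms_residual)
multiple_r = math.sqrt(r_square)

print(f"F = {f_stat:.2f}")
print(f"R = {multiple_r:.3f}, R Square = {r_square:.3f}")
print(f"Adjusted R Square = {adj_r_square:.3f}")
print(f"Std. Error of the Estimate = {std_error:.3f}")
```
</pre>
<p>The results agree with the SPSS output to the number of decimals reported (F = 121.47, R = .967, R Square = .935, Adjusted R Square = .927, Std. Error = 1.261).</p>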

<p>&nbsp;</p>

<p>&nbsp;</p>
<div align="center"><center>

<table BORDER="1" CELLPADDING="5">
  <caption><b>Coefficients(a)</b> </caption>
  <tr>
    <th COLSPAN="2" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000"><br>
    </font></th>
    <th COLSPAN="2" BGCOLOR="#FFFFFF"><font COLOR="#000000">Unstandardized Coefficients</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Standardized Coefficients</font></th>
    <th ROWSPAN="2" BGCOLOR="#FFFFFF"><font COLOR="#000000">t</font></th>
    <th ROWSPAN="2" BGCOLOR="#FFFFFF"><font COLOR="#000000">Sig.</font></th>
  </tr>
  <tr>
    <th COLSPAN="2" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Model</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">B</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Std. Error</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Beta</font></th>
  </tr>
  <tr>
    <th ROWSPAN="3" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">1</font></th>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">(Constant)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">-1.118</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.655</font> </td>
    <td BGCOLOR="#FFFFFF"><font COLOR="#000000"><br>
    </font></td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">-1.708</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">.106</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Annual Income ($000)</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.148</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.016</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.761</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">9.049</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">.000</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Family Size</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.793</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.244</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.273</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">3.245</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">.005</font> </td>
  </tr>
  <tr>
    <td COLSPAN="7" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">a Dependent Variable:
    Annual Food Expenditure ($000)</font></td>
  </tr>
</table>
</center></div><div align="center"><center>

<table BORDER="1" CELLPADDING="5">
  <caption><b><br>
  Residuals Statistics(a)</b> </caption>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000"><br>
    </font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Minimum</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Maximum</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Mean</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">Std. Deviation</font></th>
    <th BGCOLOR="#FFFFFF"><font COLOR="#000000">N</font></th>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Predicted Value</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">3.232</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">20.240</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">7.965</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">4.509</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">20</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Residual</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">-2.586</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">2.206</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">1.110E-16</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">1.193</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">20</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Std. Predicted Value</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">-1.050</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">2.722</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">.000</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">1.000</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">20</font> </td>
  </tr>
  <tr>
    <th BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">Std. Residual</font></th>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">-2.051</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">1.750</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">.000</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="center"><font COLOR="#000000">.946</font> </td>
    <td BGCOLOR="#FFFFFF" ALIGN="RIGHT"><font COLOR="#000000">20</font> </td>
  </tr>
  <tr>
    <td COLSPAN="6" BGCOLOR="#FFFFFF" ALIGN="LEFT"><font COLOR="#000000">a Dependent Variable:
    Annual Food Expenditure ($000)</font></td>
  </tr>
</table>
</center></div>

<hr align="center">
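<p>With only two IVs, the coefficients in the 'Coefficients' table can be approximately reconstructed from the means, standard deviations and Pearson correlations reported above. The Python sketch below uses the standard two-regressor formulas for the standardized (Beta) coefficients; it is an illustration under the assumption that the rounded table values are the only inputs, not a description of how SPSS computes them.</p>
<pre>
```python
# Figures copied from the 'Descriptive Statistics' and 'Correlations' tables.
mean_y, mean_x1, mean_x2 = 7.965, 45.50, 2.95
sd_y, sd_x1, sd_x2 = 4.664, 23.96, 1.61
r_y1, r_y2, r_12 = 0.946, 0.787, 0.676

# Standardized (Beta) coefficients for a regression with two IVs.
denom = 1 - r_12 ** 2
beta1 = (r_y1 - r_y2 * r_12) / denom
beta2 = (r_y2 - r_y1 * r_12) / denom

# Rescale to the original units, then solve for the intercept.
b1 = beta1 * sd_y / sd_x1
b2 = beta2 * sd_y / sd_x2
a = mean_y - b1 * mean_x1 - b2 * mean_x2

# R-square expressed through the Betas and the simple correlations.
r_square = beta1 * r_y1 + beta2 * r_y2

print(f"Beta1 = {beta1:.3f}, Beta2 = {beta2:.3f}")
print(f"b1 = {b1:.3f}, b2 = {b2:.3f}, a = {a:.3f}")
print(f"R Square = {r_square:.3f}")
```
</pre>
<p>Because the inputs are rounded to three decimals, the results match the SPSS output only approximately: roughly b<sub>1</sub> = .148, b<sub>2</sub> = .787 and a = -1.109, compared with .148, .793 and -1.118 in the table.</p>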

<p><b><font color="#800040" size="5">Step 4: Discuss the Results and Summarize your
Findings<br>
</font><font color="#000000" size="3">Similar to the presentation in the <a
href="simple_reg.htm#Simple Regression">simple regression</a> tutorial, I will discuss the
results in the order in which SPSS/win generates the outputs beginning with the
descriptive statistics tables.&nbsp; This approach permits a critical analysis of all the
results and their implications. <br>
</font></b><br>
<strong><font color="#800040" size="4">I. Descriptive Statistics</font><font
color="#800040" size="5"><br>
</font>1.</strong> <font color="#FF0000"><strong>Annual Food Expenditure</strong></font> <br>
<strong>a)</strong> The <strong>sample mean</strong> is 7.965 thousands of dollars. This
means that an average family in the sample spends $7,965 annually on food.<br>
<br>
<strong>b)</strong> The <strong>sample standard deviation</strong> of 4.664 (thousands of
dollars) is equivalent to one standard deviation of $4,664 about the mean value of
$7,965. If food expenditure is approximately normally distributed, this implies that about
68.3% of the families spend between $3,301 and $12,629 annually on
food. <br>
<br>
<strong>2</strong>. <strong><font color="#FF0000">Annual Income</font> <br>
a)</strong> The <strong>sample mean</strong> is 45.50 thousands of dollars. In terms of
income, this implies that an average family in the sample makes $45,500 annually.<br>
<br>
<strong>b)</strong> The <strong>sample standard deviation</strong> of 23.96 (thousands of
dollars) is equivalent to ±$23,960 about the mean income of $45,500. Thus, about 68.3% of the
families could be said to make between $21,540 and $69,460 annually.<br>
<br>
3. <font color="#FF0000"><strong>Family Size (measured in terms of the number of people in a
family during a year) </strong></font><br>
a) The sample mean of 2.95 means that an average family comprised about 3 persons
during&nbsp; the year.<br>
<br>
b) The sample standard deviation of 1.61 (or 2) means that&nbsp; there were between 1 and
5 members in approximately 68.3% of the families during the year.<br>
&nbsp;&nbsp;&nbsp;&nbsp; <br>
<strong>4</strong>. <strong><font color="#FF0000">Sample size</font> N (actually 'n' ) =
20</strong> simply means that there were no missing values during estimation.<br>
</p>
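<p>The one-standard-deviation ranges used in this kind of interpretation are simply the mean minus and plus the standard deviation. A short Python sketch, using the figures from the 'Descriptive Statistics' table (the helper name <code>one_sd_interval</code> is illustrative):</p>
<pre>
```python
# (mean, standard deviation) copied from the 'Descriptive Statistics' table.
stats = {
    "Annual Food Expenditure ($000)": (7.965, 4.664),
    "Annual Income ($000)": (45.50, 23.96),
    "Family Size": (2.95, 1.61),
}

def one_sd_interval(mean, sd):
    """Return the interval (mean - sd, mean + sd)."""
    return mean - sd, mean + sd

intervals = {name: one_sd_interval(m, s) for name, (m, s) in stats.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: {low:.3f} to {high:.3f}")
```
</pre>
<p>For food expenditure this gives 3.301 to 12.629 (thousands of dollars). Reading such an interval as covering about 68.3% of the families presumes an approximately normal distribution, which a sample of 20 can only suggest.</p>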

<hr align="center">

<p><strong><font color="#800040" size="4">II. Correlations Analysis</font><font
color="#800040" size="3"> </font><font color="#800040" size="4"><br>
</font></strong>This table contains the <a href="glossary.htm#Correlation Coefficient"><strong>Pearson
sample correlation coefficients</strong></a> of variable <font color="#FF0000"><strong>i</strong></font>
with variable <font color="#FF0000"><strong>j</strong></font> (denoted as <strong><font
color="#FF0000">r<sub>i,j</sub></font></strong>), which are the key tools of <a
href="glossary.htm#Correlation Analysis"><strong>Correlation Analysis</strong></a>. This
is the same <strong>Karl Pearson</strong> that I mentioned in the historical footnote
under the discussion of the Chi-square test of Independence (and also, in glossary under <a
href="glossary.htm#Regression Analysis"><strong>regression analysis</strong></a>). Let us
focus for now on the top part of the table. <strong>It is a 3 by 3 matrix</strong>. The
following conclusions are obvious: <br>
<br>
<strong>1</strong>. <strong><font color="#FF0000">The correlation of annual Food
Expenditure with itself</font> </strong>is perfect, linear, and direct since <strong><font
size="4">r</font><sub><font size="1">y,y</font></sub> = 1.000.</strong> Similar
interpretations apply to Income (<strong><font size="4">r</font><sub><font size="1">1,1</font></sub></strong>
= 1) and family Size (<strong><font size="4">r</font><sub><font size="1">2,2</font></sub></strong>
= 1).<br>
<br>
<strong>2</strong>. <font color="#FF0000"><strong>The correlation of annual Food
expenditure with Income</strong></font> is quite strong, linear and direct because<strong>
<font size="4">r</font><sub><font size="1">y,1</font></sub> = .946 </strong><br>
<br>
<strong><font color="#000000">3.</font><font color="#FF0000"> The correlation of annual
Food expenditure with Family Size</font></strong> is relatively strong, linear and direct
because<strong> <font size="4">r</font><font size="1"><sub>y,2</sub></font> = .787 </strong><br>
<br>
<strong><font color="#000000">4.</font><font color="#FF0000"> The correlation of annual
Income with Family Size </font></strong>is also strong (albeit undesirable), linear and
direct because<strong> <font size="4">r</font><font size="1"><sub>1,2</sub></font> = .676 </strong><br>
<br>
<strong>5</strong>. <font color="#FF0000"><strong>The 3 x 3 matrix is symmetric about the
main diagonal</strong></font>; hence, all the information about the type and strength of
the relationship between any two variables can be obtained from the correlation coefficients
either above the main diagonal or below it.<br>
<br>
<strong>6</strong>. The middle portion of the table contains the <strong><font
color="#FF0000">p-values (Sig. = significance)</font></strong><font
color="#000000"> for a one-tailed test (the table header reads 'Sig. (1-tailed)') of <strong>H<sub>o</sub>: P<sub>i,j</sub> = 0 against H<sub>a</sub>:
P<sub>i,j</sub> &gt; 0&nbsp; (for i not equal to j), where P (rho) is the population
correlation coefficient</strong> whose value is unknown</font>. The <strong>probability
or p-values (i.e., computed/observed values or alpha<sub>ov</sub>) of .000 and .001</strong> mean
that <strong>H<sub>o</sub></strong> can be rejected at the critical level of
<strong>alpha = .01</strong>. Thus, the conclusions in (2) through (4) above are
statistically supported. <br>
<br>
<strong>7</strong>. Again, <font color="#FF0000"><strong>N (i.e. 'n' ) = 20</strong></font>
since all the observations were used in the estimation. <br>
</p>
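<p>The significance of each sample correlation can also be checked by hand with the usual t statistic, t = r·sqrt(n - 2) / sqrt(1 - r<sup>2</sup>), which has n - 2 degrees of freedom. The Python sketch below applies it to the three off-diagonal correlations. The critical value 2.552 (one-tailed, alpha = .01, 18 degrees of freedom) is taken from a standard t table and is an assumption of this sketch, as is the one-tailed setup suggested by the 'Sig. (1-tailed)' header.</p>
<pre>
```python
import math

def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

n = 20
# One-tailed critical t at alpha = .01 with 18 df (from a standard t table).
t_critical = 2.552

# Off-diagonal entries copied from the 'Correlations' table.
correlations = {"r_y,1": 0.946, "r_y,2": 0.787, "r_1,2": 0.676}

t_stats = {name: correlation_t_stat(r, n) for name, r in correlations.items()}
for name, t in t_stats.items():
    decision = "reject H0" if t > t_critical else "do not reject H0"
    print(f"{name}: t = {t:.2f}, {decision}")
```
</pre>
<p>All three statistics exceed the critical value, which is consistent with the small p-values in the table.</p>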

<hr align="center">

<p><font color="#800040" size="4"><strong>III. Model Summary and Evaluation with S<sub>e</sub>,
R, R<sup>2</sup>, and DW Statistics<br>
</strong></font>From the 'Coefficients' table, the OLS method produces the following
estimated SRP:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<blink> <font color="#0000FF"><strong>ý<sub>i</sub></strong>
<strong>= -1.118 + .148X<sub>i,1</sub> +
.793X<sub>i,2</sub> </strong></font> </blink><br>
From the 'Model Summary' table, <font color="#0000FF"><strong>the following summary
statistics are reported: R = .967, R<sup>2</sup> = .935, Adjusted R<sup>2</sup> = .927,
&nbsp; S<sub>e</sub> = 1.261</strong></font>, and the Durbin-Watson (DW) statistic = <font
color="#0000FF"><strong>2.616</strong></font>.&nbsp; Let us explore their implications for
the accuracy of the estimated SRP. <br>
<br>
<strong>1</strong>. <strong><font color="#FF0000">The sample multiple correlation
coefficient</font><font color="#0000FF"> R =.967</font></strong> measures the degree of
relationship between the actual values (<font color="#0000FF"><strong>y<sub>i</sub></strong></font>)
and the predicted values (<font color="#0000FF"><strong>ý<sub>i</sub></strong></font>) of
the annual family food expenditure. Because the <font color="#0000FF"><strong>ý<sub>i</sub></strong></font>
values are obtained as a linear combination of Income (<font color="#0000FF"><strong>X<sub>1</sub></strong></font>)
and family Size (<font color="#0000FF"><strong>X<sub>2</sub></strong></font>), the
coefficient value of&nbsp; <font color="#0000FF"><strong>.967</strong></font> indicates
that the relationship between family food expenditure and the two IVs is quite strong and
positive.<br>
<br>
<strong>2</strong>. <strong><font color="#FF0000">The sample Coefficient of Determination</font></strong>
<font color="#0000FF"><strong>R-square or R<sup>2</sup></strong></font> (<strong><font
color="#0000FF">r<sup>2</sup></font> is commonly used in simple regression analysis while
R<sup>2</sup> is appropriately reserved&nbsp; for multiple regression analysis)</strong>.
It measures the goodness-of-fit of the estimated SRP in terms of the proportion of the
variation in the DV explained by the fitted sample regression equation or SRP. Thus, the
value of <strong><font color="#0000FF">R<sup>2</sup> = .935</font></strong> simply means
that about <strong>94%</strong> of the variation in annual Family Food Expenditure is
explained or accounted for by the estimated SRP that uses Income and family Size as the
IVs.&nbsp; This information is quite useful in assessing the overall accuracy of the
model. Notice that <b><font color="#0000FF">R<sup>2</sup></font></b> = <font
color="#0000FF"><b>(R = .967</b><strong>)<sup>2</sup></strong></font>. <b><br>
<br>
</b><strong>3</strong>. <strong><font color="#FF0000">Adjusted R-Square (or R<sup>2</sup>
with a bar over it)</font></strong> is the sample <strong>Coefficient of Determination</strong>
after adjusting for the degrees of freedom lost in the process of estimating the
regression parameters. In this case, three parameters <font color="#800040"><strong>A</strong></font>
and <font color="#800040"><strong>B<sub>1</sub> and B<sub>2</sub> </strong></font>were
estimated so that three degrees of freedom (df) have been lost; thus, the remaining df can
be determined as <b>v = n - k</b>, where k denotes the number of parameters in the LRM
(here v = 20 - 3 = 17). Hence, the adjusted <strong>R-square</strong> is a better measure of the
goodness-of-fit of the estimated SRP than its nominal/unadjusted counterpart. It is always
smaller in value than the unadjusted <strong>R-square</strong>. I will examine the adjusted coefficient of
determination in some detail in Econs. 853 and 976. <br>
<br>
<strong>4</strong>. <strong><font color="#FF0000">Standard Error of the Estimate (standard
notation is S<sub>e</sub>)</font></strong>. This summary statistic measures the overall
accuracy or quality of the estimated SRP in terms of the average/standardized unexplained
variation in the DV that may be due to possible errors that could originate from <strong>(i)</strong> 
chance errors of sampling or sampling errors, thereby causing the values of ‘a’ 
and ‘b’ to differ significantly from the true but unknown values of the 
parameters ‘<font color="#800040"><strong>A</strong></font>’ and ‘<font
color="#800040"><strong>B</strong></font>’; and <strong>(ii)</strong> possible
variation in the parameters, which, according to the Classical Assumptions, are presumed
constant. If these errors are small, on average, then the value of <strong>S<sub>e</sub></strong>
could approach zero (exactly equal to zero if the estimated values of the DV, denoted here
as <strong><font color="#0000FF" size="4">ý</font><font color="#0000FF" size="1"><sub>i</sub></font></strong>
equals their actual/observed counterparts <strong><font color="#0000FF" size="4">y</font><font
color="#0000FF"><sub>i</sub></font></strong> <strong>for all i = 1, 2, ..., n</strong>).
If otherwise, the values of <strong>S<sub>e</sub></strong> approach +infinity; in which
case the estimated SRP must be considered useless especially if application involves the
prediction of the DV outside the sample period. Note that <strong>S<sub>e</sub></strong>
is the sample estimate of the <font color="#0000FF"><strong>standard deviation</strong></font>
around the true conditional PRP <strong><font size="4">µ</font><font color="#000000"><sub>y/x</sub></font>
= A + B<sub>1</sub>X<sub>i,1</sub></strong> <strong>+</strong>  <strong>B<sub>2</sub>X<sub>i,2</sub></strong>
which is denoted as <strong><font color="#0000FF" size="2">&sigma;</font><font color="#0000FF"><sub>y/x</sub></font></strong>.<br>
<br>
In this example, <strong><font color="#0000FF">S<sub>e</sub> = 1.261</font></strong>
means that, on average, the observed values of the annual family <strong>Food expenditure
could vary by ±$1261 about the estimated regression equation for each value of the Income
and Family size during the sample period -- and by a much larger amount outside the sample
period</strong>. This is why prediction outside the sample period requires the use of the <strong><font
color="#0000FF">standard errors of the estimators ‘a’, ‘b<sub>1</sub>’ and ‘b<sub>2</sub>’ 
(denoted, respectively, as S<sub>a</sub>, S<sub>b1</sub>,
and S<sub>b2</sub>)</font></strong> for establishing confidence intervals about the
conditional mean values <strong><font size="4">µ</font><font color="#000000"><sub>y/x</sub></font></strong>.
Note that<strong> S<sub>a</sub>, S<sub>b1</sub> and S<sub>b2</sub> take into account the
chance errors of sampling mentioned earlier. Accounting for parameter variation will
require the application of advanced econometric techniques, which are beyond the scope of
the undergraduate material. </strong><br>
<br>
<strong>5</strong>. <font color="#FF0000"><strong>Durbin Watson (DW) Statistic</strong></font>
measures the presence, or lack thereof, of <font color="#800040"><strong>Serial
Correlation (also known as Autocorrelation)</strong></font> among the errors from one
observation (or time period) to other observations (or time periods). Details about the
implications of the existence of the autocorrelation will be examined in Econs 853 and 976
classes. For now, suffice it to say that a value of <font color="#0000FF"><strong>DW =
2.616 </strong></font>means that the residuals <font color="#0000FF"><strong>é</strong> <strong>=</strong></font>
<strong><font color="#0000FF" size="4">y</font><font color="#0000FF"><sub>i</sub></font></strong>
<font color="#0000FF">-</font> <strong><font color="#0000FF" size="4">ý</font><font
color="#0000FF" size="1"><sub>i</sub></font></strong> <strong>(for all i = 1, 2, ..., n) </strong>from
the estimated regression model are negatively correlated from one observation to the next
(DW values above 2 indicate negative autocorrelation; values below 2, positive autocorrelation) -- suggesting the
presence of negative autocorrelation in the error terms (E<sub>i</sub>).&nbsp; According
to the Classical Assumptions, this is undesirable.&nbsp; The ideal value of the DW
statistic should be <font color="#0000FF"><strong>2.00</strong></font> to indicate the
absence of autocorrelation. Again, a detailed discussion of autocorrelation will be presented
in Econs. 853 &amp; 976.</p>
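<p>As a rough check on items 1, 2, 3 and 5 above, the summary statistics can be recomputed from the reported values alone. The following is a minimal Python sketch and not part of the SPSS output; the variable names are illustrative assumptions, and the relation DW = 2(1 - r) is only a large-sample approximation, so the implied first-order correlation of the residuals should be read as indicative.</p>

```python
import math

# Values reported in the 'Model Summary' table of this tutorial.
n = 20              # number of observations
k = 3               # parameters estimated: A, B1 and B2
r_squared = 0.935   # coefficient of determination
dw = 2.616          # Durbin Watson statistic

# Multiple correlation coefficient is the positive square root of R-square.
multiple_r = math.sqrt(r_squared)

# Adjusted R-square: penalize R-square for the degrees of freedom used up.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k)

# Large-sample approximation DW = 2(1 - r), solved for r (an assumption,
# not a value reported by SPSS).
rho_hat = 1 - dw / 2

print("R           =", round(multiple_r, 3))
print("Adjusted R2 =", round(adj_r_squared, 3))
print("Implied r   =", round(rho_hat, 3))
```

<p>Under these assumptions the sketch reproduces R = .967 and Adjusted R<sup>2</sup> = .927, and the implied residual correlation is about -.31, which is negative because DW exceeds 2.</p>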

<hr align="center">

<p><strong><font size="4" color="#800040">IV. ANOVA Table: Testing the Significance of the
Model</font><font color="#800040" size="5"><br>
</font></strong>The summary measures reported here are used in the partitioning of the
total variation in the DV according to the identity relation <font color="#0000FF"><strong>TSS
= ESS + RSS</strong></font>, where TSS is the Total Sum of Squares in the DV, ESS is the
Explained Sum of Squares due to the fitted regression equation or model, and RSS is the
Residual (remaining) Sum of Squares that is unexplained and hence attributable to errors
(i.e., chance sampling errors, and those resulting from parameter variation). Note the
following: (1) The smaller RSS is relative to the TSS, (or the larger ESS is relative to
TSS), the better the estimated regression equation fits the data. (2) The underlying
principle in the partition of TSS is similar to that of the ANOVA technique.&nbsp; As in
that technique, the identity relation carries over to the associated degrees of freedom in
this manner: <strong><font color="#0000FF">v = v<sub>1</sub> + v<sub>2</sub></font></strong>
where <strong><font color="#0000FF">v<sub>1</sub> = k-1</font></strong>, and <strong><font
color="#0000FF">v<sub>2</sub></font></strong> <font color="#0000FF"><strong>= n-k</strong></font>
so that <font color="#0000FF"><strong>v = n - 1</strong></font>; where <font
color="#0000FF"><strong>k</strong></font> denotes the number of parameters that are
estimated. (3) If <font color="#0000FF"><strong>k</strong></font> is defined as the number
of IVs in the model, then <strong><font color="#0000FF">v<sub>1</sub></font></strong> <font
color="#0000FF"><strong>= k</strong></font>, and <strong><font color="#0000FF">v<sub>2</sub></font></strong>
<font color="#0000FF"><strong>= n-k-1</strong></font>; again, <strong><font
color="#0000FF">v = v<sub>1</sub> + v<sub>2</sub> = n -1</font></strong>.<br>
<br>
<strong><blink><font color="#FF0000">Caution</font></blink></strong>: Some authors use RSS
(regression sum of squares) instead of ESS (explained sum of squares), and ESS (error sum
of squares) instead of RSS (residual sum of squares) so that the identity is stated as TSS
= RSS + ESS.&nbsp; So pay attention to how these acronyms are defined.<br>
<br>
The null hypothesis (<strong><font color="#0000FF">H<sub>o</sub></font></strong>) to
verify is that all of the IVs in the model, considered together, have no causal effect on
the DV; in which case the LRM that relates these IVs to the DV does not exist. The
alternative hypothesis (<strong><font color="#0000FF">H<sub>a</sub></font></strong>) is
that this is not the case; indeed, at least one of the IVs significantly influences the
DV. The formats of both <strong><font color="#0000FF">H<sub>o</sub></font></strong> and <strong><font
color="#0000FF">H<sub>a</sub></font></strong> are:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<strong><font color="#0000FF">H<sub>o</sub>: B<sub>1</sub> = B<sub>2</sub> = 0</font></strong>
against <strong><font color="#0000FF">H<sub>a</sub></font></strong>: They are not all
equal to zero; at least one is nonzero<br>
<br>
From the ANOVA table, under the df column, <strong><font color="#0000FF">v<sub>1</sub> =
2, v<sub>2</sub> = 17</font></strong>, <strong><font color="#0000FF">v = 19, and F<sub>ov</sub></font></strong>
<font color="#0000FF"><strong>= 121.470</strong></font>. A significance level of
.05 implies the critical F-value <strong><font color="#0000FF">F<sub>cv </sub>= F<sub>.<small>05,
2, 17</small></sub> = 3.59</font></strong> from the F distribution table.&nbsp; Because F<sub>ov</sub> far exceeds F<sub>cv</sub>, we
can reject <strong><font color="#0000FF">H<sub>o</sub></font></strong> in favor of <strong><font
color="#0000FF">H<sub>a</sub></font></strong>. This means that the LRM that has been
estimated is not a mere theoretical construct; indeed it does exist and is statistically
significant. </p>
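<p>The F test above can be cross-checked against the Model Summary, because the F ratio can also be written in terms of R<sup>2</sup> as F = (R<sup>2</sup>/v<sub>1</sub>) / ((1 - R<sup>2</sup>)/v<sub>2</sub>). The Python sketch below is an illustration only; the variable names are assumptions, and the critical value 3.59 is simply copied from the text rather than computed.</p>

```python
# Values reported in this tutorial.
n = 20              # number of observations
num_ivs = 2         # Income and family Size
r_squared = 0.935   # from the 'Model Summary' table
f_critical = 3.59   # F(.05; 2, 17), taken from the text above

# Degrees of freedom, with k defined as the number of IVs.
v1 = num_ivs            # explained (regression) df
v2 = n - num_ivs - 1    # residual df

# F ratio expressed through R-square.
f_from_r2 = (r_squared / v1) / ((1 - r_squared) / v2)

# Reject Ho when the computed F exceeds the critical F.
reject_h0 = f_from_r2 > f_critical

print("v1, v2, v  =", v1, v2, v1 + v2)
print("F from R2  =", round(f_from_r2, 2))
print("Reject Ho? =", reject_h0)
```

<p>The value obtained this way is about 122.3, close to the reported F<sub>ov</sub> = 121.470; the small gap presumably reflects the rounding of R<sup>2</sup> to three decimals. Either value leads to the same decision.</p>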

<hr align="center">

<p><font size="4" color="#800040"><strong>V. Coefficients Table: T-Test of the
Significance of the Regression Coefficients</strong></font><small> </small><br>
This table contains the estimated regression coefficients (a = -1.118, b<sub>1</sub> =
.148, and b<sub>2</sub> = .793); hence, the estimated SRP/equation can be written as
&nbsp; <blink><strong><font color="#0000FF" size="4">ý</font><font color="#0000FF"
size="1"><sub>i</sub></font></strong><font color="#0000FF" size="4">  <strong>= -1.118 +
.148</strong></font><strong><font color="#0000FF" size="3">X</font><font color="#0000FF" size="5"><sub>i,1</sub></font><font
color="#0000FF" size="4"> + .793</font><font color="#0000FF" size="3">X</font><font
color="#0000FF" size="5"><sub>i,2</sub></font></strong></blink>. The estimated coefficients
have the following interpretations: <br>
<br>
1.<font color="#FF0000"><strong> a = -1.118</strong></font> has no interpretable meaning
because the average level of family Food expenditure could not be negative even when no
member of the family is gainfully employed. Moreover, it is unrealistic to think of the existence
of a family that has no income and no members and yet incurs expenditure on food.&nbsp;
Nonetheless, this value should not be discarded; it plays an important role when using the
estimated regression line/equation for prediction. <br>
<br>
2. <font color="#FF0000"><strong>b<sub>1</sub> = .148</strong></font> represents the
partial effect of annual family Income on Food Expenditure, holding family Size constant.
The estimated positive sign implies that such effect is positive while the absolute value
implies that Food Expenditure would increase by $148 for every $1000 increase in Income. <br>
<br>
3. <font color="#FF0000"><strong>b<sub>2</sub> = .793</strong></font> represents the
partial effect of family Size on Food Expenditure, holding family Income constant. The
estimated positive sign implies that such effect is positive while the absolute value
implies that Food Expenditure would increase by $793 for every additional member to the
family either by marriage, birth or adoption. Note that the addition to a family by
marriage is a possibility because there were some families in the sample with only one
person. <br>
<br>
4. <font color="#FF0000"><strong>Standard errors of the estimators</strong></font>: <strong>Assessing
the precision of 'a', 'b<sub>1</sub>', and 'b<sub>2</sub>'</strong><br>
<strong><font color="#0000FF">S</font><font color="#0000FF" size="1"><sub>a</sub></font><font
color="#0000FF" size="4"> = .</font><font color="#0000FF" size="3">655, S</font><font
color="#0000FF" size="2"><sub>b1</sub></font></strong><font color="#0000FF"> <strong>=
.016</strong></font>, <strong><font color="#0000FF" size="3">and S</font><font
color="#0000FF" size="2"><sub>b2</sub></font></strong><font color="#0000FF"> <strong>=
.244, </strong></font>respectively, measure the precision of the estimated values of <font
color="#0000FF"><strong>a = -1.118, <font size="3">b</font><font size="2"><sub>1</sub></font>
= .148, and <font size="3">b</font><font size="2"><sub>2</sub></font> = .793</strong></font>,
in taking on or estimating the true but unknown values of the corresponding regression
parameters <font color="#800040"><strong>A</strong></font> and <strong><font
color="#800040">B<sub>1</sub></font></strong> and <font color="#800040"><strong>B<sub>2</sub></strong></font>.
The closer the values of <strong><font color="#0000FF">S</font><font color="#0000FF"
size="1"><sub>a</sub></font><font color="#0000FF" size="4">, </font><font color="#0000FF"
size="3">S</font><font color="#0000FF" size="2"><sub>b1</sub></font></strong>, <strong><font
color="#0000FF" size="3">and S</font><font color="#0000FF" size="2"><sub>b2</sub></font></strong><font
color="#0000FF"> </font>to zero, the higher the precision of the estimates, suggesting
that chance errors due to sampling are not severe. The converse would suggest the opposite.
Thus <strong><font color="#0000FF">S<sub>b1</sub> = .016</font></strong> implies that <strong><font
color="#0000FF">b<sub>1</sub> = .148</font></strong> is likely to be much closer to the true value
of <strong><font color="#800040">B<sub>1</sub></font></strong> than
is <strong><font color="#0000FF">b<sub>2</sub> = .793</font></strong> (with <strong><font
color="#0000FF">S<sub>b2</sub> = .244</font></strong>) to <strong><font
color="#800040">B<sub>2</sub></font></strong>; and <strong><font color="#0000FF">S<sub>a</sub> = .655</font></strong>
implies quite the opposite, coupled with the
fact that the estimated sign contradicts common sense or reality. <br>
<br>
5. <font color="#FF0000">S<strong>tandardized Coefficients</strong></font>: <strong>Assessing
the Relative Importance of the IVs</strong><br>
The standardized coefficients are useful for determining the relative importance of the
IVs in the model. In effect, the importance of the IVs can be ranked according to the size (i.e.,
the absolute value) of the beta coefficients.&nbsp; In this example, the beta coefficient
for income <font color="#0000FF"><strong><font size="3">b<sup>*</sup></font><font size="2"><sub>1</sub></font></strong></font>=
<strong><font color="#0000FF" size="3">.148</font><font color="#0000FF"> (23.96/4.66) =
.762</font></strong> (under the &quot;Beta&quot; column), where 23.96 and 4.66 are the
sample standard deviations of family Income and Food Expenditure, respectively.&nbsp; The
beta coefficient for family Size is <strong><font color="#0000FF" size="3">b<sup>*</sup></font><font
size="2" color="#0000FF"><sub>2</sub></font></strong> <strong><font color="#0000FF">=
.793(1.61/4.66) = .273</font></strong>, where 1.61 is the sample standard deviation of the
family Size variable.&nbsp; Thus the estimated SRP can be expressed in terms of the beta
coefficients as <font color="#0000FF"><strong>ý<sub>i</sub></strong> <strong>=</strong></font> <strong><font
color="#0000FF" size="4"><blink> </blink></font><font color="#0000FF">.762X<sub>i,1</sub> <font
size="4">+</font></font><font color="#0000FF" size="4"><blink> </blink></font><font
color="#0000FF">.273X<sub>i,2</sub></font></strong>, where the DV and the IVs are expressed in standardized (z-score) form. Because the absolute value of the beta
coefficient for income is larger, it can be concluded that income is a relatively more
important predictor of family food expenditure than the size of the family.&nbsp;&nbsp; <br>
<br>
Suppose we had included a third IV (<font color="#0000FF"><strong>X<sub>3</sub></strong></font>,
say, the local price level for each family assuming families were randomly selected from a
national pool) and came up with an estimated beta coefficient of -.825.&nbsp; Then the
ranking of the IVs according to their relative importance in predicting/explaining family
food expenditure would be as follows: 1 for <font color="#0000FF"><strong>X<sub>3</sub></strong></font>,
2 for <font color="#0000FF"><strong>X<sub>1</sub></strong></font>, and 3 for <font
color="#0000FF"><strong>X<sub>2</sub></strong></font> .&nbsp; <br>
<br>
<strong>6</strong>. <font color="#FF0000">O<strong>bserved/computed t statistic</strong></font>
(<strong><font color="#0000FF">t</font><font color="#0000FF" size="1"><sub>ov</sub></font></strong>):
<strong>T-test of the Significance and Signs of the Regression Parameters.</strong><br>
As part of investigating the accuracy of the fitted SRP, it is often useful to verify both
the statistical significance and the sign (i.e., economic significance) of the regression
parameters/coefficients (B<sub>1</sub>, B<sub>2</sub>) individually.&nbsp; For statistical
significance, the maintained hypothesis is that the IV or X<sub>j</sub> has no causal
effect on the DV or Y. Thus, the null is <strong><font size="2" color="#0000FF">H</font><font
color="#0000FF" size="1"><sub>0</sub></font><font color="#0000FF" size="3">: B<sub>j</sub>
= 0 </font></strong>(i.e., X<sub>j</sub> has no causal effect on the DV) against the
alternative that <strong><font size="2" color="#0000FF">H</font><font color="#0000FF"
size="1"><sub>a</sub></font><font color="#0000FF" size="3">: B<sub>j</sub> is not equal to
zero</font></strong> (i.e., X<sub>j</sub> does indeed have some causal effect on the DV;
such effect may be direct or indirect). <br>
<br>
<b>a. Testing for Statistical Significance of B<sub>j </sub></b>
<br>
With respect to income, the null is <strong><font size="2" color="#0000FF">H</font><font
color="#0000FF" size="1"><sub>0</sub></font><font color="#0000FF" size="3">: B<sub>1</sub>
= 0 </font></strong>(i.e., Income has no causal effect on Food Expenditure), against the
alternative that <strong><font size="2" color="#0000FF">H</font><font color="#0000FF"
size="1"><sub>a</sub></font><font color="#0000FF" size="3">: B<sub>1</sub> is not equal to
zero</font></strong> (i.e., income indeed does have some causal effect on food
expenditure). &nbsp; For the Family Size, the null is <strong><font size="2"
color="#0000FF">H</font><font color="#0000FF" size="1"><sub>0</sub></font><font
color="#0000FF" size="3">: B<sub>2</sub> = 0 </font></strong>(i.e., Family Size has no
causal effect on food expenditure), against the alternative that <strong><font size="2"
color="#0000FF">H</font><font color="#0000FF" size="1"><sub>a</sub></font><font
color="#0000FF" size="3">: B<sub>2</sub> is not equal to zero</font></strong> (i.e.,
Family Size indeed does have some causal effect on food expenditure).&nbsp; For alpha =
.05 and v = n - k - 1 = 20 - 2 - 1 = 17 (with k here denoting the number of IVs), this implies a critical t-value of <strong><font
color="#0000FF">t<sub>cv</sub> = t<sub>.025,17</sub> = ±2.110</font></strong>. For Income, <strong><font color="#0000FF">t</font><font
color="#0000FF" size="1"><sub>ov</sub></font><font color="#0000FF" size="4"> = </font><font
color="#0000FF" size="3">9.049.</font><font color="#0000FF" size="4"> </font></strong>Thus,
<strong><font color="#0000FF">H<sub>o </sub></font></strong>must unequivocally be rejected
in favor of <strong><font color="#0000FF">H<sub>a</sub></font></strong>; in which case,
family Income can be said to have a significant influence on family Food Expenditure.<strong><font
color="#0000FF" size="3"> </font></strong><font size="3" color="#000000">For family Size, </font><strong><font
color="#0000FF">t</font><font color="#0000FF" size="1"><sub>ov</sub></font><font
color="#0000FF" size="4"> = </font><font color="#0000FF" size="3">3.245</font></strong>.
So, <strong><font color="#0000FF">H<sub>o </sub></font></strong>must be rejected in favor
of <strong><font color="#0000FF">H<sub>a</sub></font></strong>; in which case, family Size
can be said to have a significant influence on family Food Expenditure.<br>
<br>
<b>b. Testing for Economic/practical Significance of B<sub>j </sub></b>
<br>
An interesting variation of the t-test is to verify the economic significance of the
parameter with respect to the direction of causality of the associated IV.&nbsp; In this
case, the null is phrased as <strong><font color="#0000FF">H<sub>0</sub></font></strong>: <strong><font
color="#0000FF" size="3">B<sub>j</sub></font></strong><font color="#0000FF"> <strong>has a
value that is at the most zero</strong></font>, against <strong><font color="#0000FF">H<sub>a</sub></font></strong>:
<strong><font color="#0000FF" size="3">B<sub>j</sub></font><font color="#0000FF"> &gt; 0</font></strong>
(i.e; its value is strictly positive according to the underlying economic theory). If the
sign of the parameter was expected to be negative on the basis of theory or common sense,
then the null is phrased as <strong><font color="#0000FF">H<sub>0</sub></font></strong>: <strong><font
color="#0000FF" size="3">B<sub>j</sub></font></strong><font color="#0000FF"> <strong>has a
value that is at the least zero</strong></font>, against <strong><font color="#0000FF">H<sub>a</sub></font></strong>:
<strong><font color="#0000FF" size="3">B<sub>j</sub></font><font color="#0000FF"> &lt; 0</font></strong>
(i.e; its value is strictly negative according to the underlying economic theory). <br>
<br>
Consider, for example, family size where the sign of <font color="#0000FF" size="3"><strong>B<sub>2</sub></strong></font>
is expected to be positive. <strong><font color="#0000FF">H<sub>0</sub></font></strong>: <strong><font
color="#0000FF" size="3">B<sub>2</sub></font></strong><font color="#0000FF"> has a value
that is at the most zero</font> against <strong><font color="#0000FF">H<sub>a</sub></font></strong>:
<strong><font color="#0000FF" size="3">B<sub>2</sub></font><font color="#0000FF"> &gt; 0</font></strong>.
&nbsp; At the level of alpha = .05, the critical t-value is <strong><font color="#0000FF">t<sub>cv</sub>
= t<sub>.05,17</sub> = +1.740</font></strong>. Because <strong><font color="#0000FF">t<sub>ov</sub>
= 3.245</font></strong> exceeds this critical value, <strong><font color="#800040">H<sub>o</sub>
of negative or no effect of family Size must be rejected unequivocally.</font></strong> <br>
<br>
Note that in the test for economic significance of a parameter the alpha value is not
divided by two since this is always a one-tailed test; whereas, it is divided by 2 in the
test for statistical significance since this is always a two-tailed test.<br>
<br>
<strong>7</strong>. <strong><font color="#FF0000">Prediction --using the estimated SRP </font></strong><br>
Suppose a typical or <b><font color="#0000FF">i<sup>th </sup></font></b>family drawn from
the same population had an annual Income of $30,000 in 1993 with a family size of 2
members (this is the 8<sup>th</sup> family in our <a href="scattergram.htm#Scattergram"><font
size="3">sample</font></a>). Its estimated/predicted annual Food Expenditure,
corresponding to <strong><font color="#0000FF">X<sub>8,1</sub> = 30</font></strong> (thousands of dollars) and&nbsp; <strong><font
color="#0000FF">X<sub>8,2</sub> = 2</font></strong>, would be <strong><font color="#0000FF">ý<sub>8</sub> =
-1.118 + .148 x 30 + .793 x 2 = 4.908</font></strong> thousands of dollars. Thus, $4908 is
the best estimate of the average annual Food Expenditure for this family.&nbsp; But this
family actually spent 5.8 thousands of dollars or $5800. Hence, the positive residual of
$892 (i.e., e<small><sub>8</sub></small> = 5800-4908) is the amount by which the estimated
SRP has underpredicted the annual Food Expenditure for this family. </p>
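<p>The arithmetic in items 5, 6 and 7 above can be retraced with a few lines of Python. This is a sketch under stated assumptions, not the SPSS procedure: it uses only the rounded coefficients, standard errors and standard deviations quoted in this tutorial, and the function name <b>predict_food</b> and the variable names are hypothetical.</p>

```python
# Rounded values quoted in the 'Coefficients' table and in the text.
a, b1, b2 = -1.118, 0.148, 0.793                   # estimated coefficients
se_b1, se_b2 = 0.016, 0.244                        # their standard errors
sd_income, sd_size, sd_food = 23.96, 1.61, 4.66    # sample std. deviations

# Observed t ratios: each coefficient divided by its standard error.
t_b1 = b1 / se_b1
t_b2 = b2 / se_b2

# Beta (standardized) coefficients: b_j scaled by sd(X_j) / sd(Y).
beta1 = b1 * sd_income / sd_food
beta2 = b2 * sd_size / sd_food


def predict_food(income_thousands, family_size):
    """Predicted annual food expenditure, in thousands of dollars."""
    return a + b1 * income_thousands + b2 * family_size


# Family 8: Income of 30 (thousand dollars), 2 members, actual spending 5.8.
y_hat_8 = predict_food(30, 2)
residual_8 = 5.8 - y_hat_8

print("t ratios :", round(t_b1, 3), round(t_b2, 3))
print("betas    :", round(beta1, 3), round(beta2, 3))
print("y-hat 8  :", round(y_hat_8, 3))
print("residual :", round(residual_8, 3))
```

<p>With these rounded inputs the t ratios come out near 9.25 and 3.25, and the beta coefficients near .761 and .274, slightly different from the reported 9.049, 3.245, .762 and .273; the differences presumably arise because SPSS works with unrounded values. The prediction of 4.908 and the residual of .892 (in thousands of dollars) match the text.</p>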

<p><a href="#top"><font size="4"><strong>Top</strong></font></a> <b>or Return to <a
href="regression.htm#REGRESSION">Regression &amp; Correlation Analysis</a> or <a
href="learning.htm#Learning Statistics with SPSS/win">Learning Statistics with SPSS/win</a></b>
<b>or <a href="index.htm#Home Page">Home Page</a> or Send me your <a
href="mailto:eeusip@cc.ysu.edu">Comments via E-mail</a>.</b><br>
</p>

<hr align="center">

<p><small><strong>Copyright© 1996, Ebenge Usip, all rights reserved.<br>
Last revised: 
<!--webbot bot="Timestamp" S-Type="EDITED"
S-Format="%A, %B %d, %Y" startspan -->Wednesday, July 10, 2013<!--webbot bot="Timestamp" i-checksum="56355" endspan -->.</strong></small><font
size="1"><br wp="br1">
</font></p>
</body>
</html>