Simple Correlation Analysis: Graphical Analysis, Dr. Usip, Economics

Simple Correlation Analysis: Graphical Analysis

Problem Description & Data
In this example, a scatterplot is used to explore the type and degree of the linear relationship between two quantitative variables for a stereo and equipment store: Sales Volume in units of $100 (denoted as Y) and the No. of Commercials (denoted as X). The basic research questions are to determine:

1. The type of relationship between the two variables - positive (direct) or negative (indirect).

2. The degree of the relationship, if any.

This example is taken from Anderson, Sweeney & Williams (1999, 7th ed.), p. 46. The data set is given as follows:

X: 2, 5, 1, 3, 4, 1, 5, 3, 4, 2

Y: 50, 57, 41, 54, 54, 38, 63, 48, 59, 46
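
Before turning to the SPSS output, the pattern in the data can be previewed with a rough text scatterplot. The following is only a minimal sketch in Python, not the SPSS procedure; the function name text_scatter and the choice of grouping Y into bands of 5 units are assumptions made for illustration.

```python
# Data from the example: x = number of commercials, y = sales volume (in $100)
x = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
y = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]

def text_scatter(xs, ys, y_step=5):
    """Return a crude text scatterplot (hypothetical helper).

    Each row covers a band of y values that is y_step wide (an assumed
    bin width); each column is one integer value of x. A '*' marks a
    cell containing at least one observation, a '.' marks an empty cell.
    """
    x_values = range(min(xs), max(xs) + 1)
    top = (max(ys) // y_step) * y_step        # lower edge of the highest band
    bottom = (min(ys) // y_step) * y_step     # lower edge of the lowest band
    rows = []
    for low in range(top, bottom - 1, -y_step):
        cells = []
        for xv in x_values:
            hit = any(px == xv and low <= py < low + y_step
                      for px, py in zip(xs, ys))
            cells.append("*" if hit else ".")
        rows.append(f"{low:>3}-{low + y_step - 1:<3}| " + "  ".join(cells))
    # Bottom axis: the x values, aligned under the columns
    rows.append(" " * 9 + "  ".join(str(xv) for xv in x_values))
    return "\n".join(rows)

print(text_scatter(x, y))
```

In the printed grid the stars run from the lower left (few commercials, low sales) to the upper right (many commercials, high sales), which is the pattern a proper scatterplot of these data displays.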

Using the command sequence, SPSS/win generates the following output:

[Figure: scatterplot of Sales Volume (Y) against No. of Commercials (X)]

The following conclusions are apparent from the scattergram:

1. The relationship is positive/direct: more commercials are associated with more sales.

2. The relationship is linear: the points follow an approximately straight-line pattern, so the relationship can be estimated by fitting a straight line to the data.

3. The relationship appears to be strong: the dots cluster closely about the line of true relationship.
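
These visual conclusions can be checked numerically. The sketch below assumes the straight line is fitted by ordinary least squares (the text above does not name a fitting method) and measures strength with the sample correlation coefficient r; the helper name summarize is hypothetical.

```python
from math import sqrt

# Data from the example: x = number of commercials, y = sales volume (in $100)
x = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
y = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]

def summarize(xs, ys):
    """Return (slope, intercept, r) for paired data (hypothetical helper).

    Assumes an ordinary least-squares line and the Pearson sample
    correlation coefficient.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sums of squared deviations about the means
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    s_yy = sum((b - y_bar) ** 2 for b in ys)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r = s_xy / sqrt(s_xx * s_yy)
    return slope, intercept, r

slope, intercept, r = summarize(x, y)
print(f"fitted line: Y = {intercept:.2f} + {slope:.2f} X")  # Y = 36.15 + 4.95 X
print(f"sample correlation r = {r:.2f}")                    # r = 0.93
```

The positive slope agrees with conclusion 1, and a value of r close to +1 agrees with conclusion 3.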

In general, the following statements (which are not exhaustive) describe possible types and degrees of relationship between any two variables, each of which can be depicted graphically with the aid of a scatterplot:

1. The relationship is perfectly strong, linear, and direct.
2. The relationship is extremely strong, linear, and direct.
3. The relationship is strong, linear, and direct.
4. The relationship is weak, linear, and direct.
5. There is no apparent relationship between the two variables.
6. The relationship is weak, linear, and indirect.
7. The relationship is strong, linear, and indirect.
8. The relationship is extremely strong, linear, and indirect.
9. The relationship is perfectly strong, linear, and indirect.

The types and degrees of linear relationship represented by the above statements can all be quantified with a summary measure called the sample correlation coefficient (denoted simply as 'r'), which is computed from the correlation formula. This will be discussed shortly. Meanwhile, let us examine how the scattergram can be used as an exploratory tool in the preliminary phase of a data analysis (e.g., simple regression analysis). One important aspect of a bivariate statistical analysis is to uncover the type of relationship inherent in the data. Once that is done, the next phase is to model the relationship and estimate (quantify) it before applying the estimated model to decision making. The regression example illustrates the use of a scattergram as a preliminary analytical device.
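
For reference, the correlation formula is not written out in this section. The standard definition of the Pearson sample correlation coefficient is given below; the notation (x_i, y_i for the paired observations, n for the sample size, and bars for sample means) is an assumption and may differ from the notation used elsewhere in these notes.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

In terms of the statements listed above, r = +1 corresponds to statement 1 (perfectly strong, linear, and direct), r = -1 corresponds to statement 9 (perfectly strong, linear, and indirect), and r near 0 corresponds to statement 5 (no apparent linear relationship).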