# Correlation

by Saul McLeod, published 2008

Correlation means association; more precisely, it is a measure of the extent to which two variables are related.

If an increase in one variable tends to be associated with an increase in the other then this is known as a positive correlation. An example would be height and weight. Taller people tend to be heavier.

If an increase in one variable tends to be associated with a decrease in the other then this is known as a negative correlation. An example would be height above sea level and temperature. As you climb a mountain (increase in height) it gets colder (decrease in temperature).

When there is no relationship between two variables this is known as a zero correlation. For example, there is no relationship between the amount of tea drunk and level of intelligence.

A correlation can be expressed visually. This is done by drawing a scattergram; that is, one plots the figures for one variable against the figures for the other on a graph.

When you draw a scattergram it doesn't matter which variable goes on the x-axis and which goes on the y-axis. Remember, in correlations we are always dealing with paired scores, so the values of the 2 variables taken together will be used to make the diagram. Decide which variable goes on each axis and then simply put a cross at the point where the 2 values coincide.
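The plotting procedure described above can be sketched in a few lines of Python. The height and weight figures are made-up illustrative values, and `scattergram` is a hypothetical helper name rather than a library function; it simply places one cross per pair of scores on a text grid.

```python
# Made-up paired scores (assumption): each person contributes one height and one weight.
heights_cm = [150, 160, 165, 170, 180, 190]
weights_kg = [52, 58, 63, 68, 77, 85]
pairs = list(zip(heights_cm, weights_kg))


def scattergram(pairs, width=20, height=10):
    """Return a text scattergram with one 'x' per (x, y) pair of scores."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_min, y_min = min(xs), min(ys)
    # Fall back to 1 so that identical scores do not cause division by zero.
    x_span = (max(xs) - x_min) or 1
    y_span = (max(ys) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in pairs:
        # Scale each score to a column and row index on the grid.
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "x"  # row 0 of the grid is the top of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


print(scattergram(pairs))
```

With these values the crosses run from the bottom left to the top right, the pattern expected for a positive correlation. Swapping the two lists inside `zip` transposes the plot but leaves the pattern of the relationship unchanged.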

Strictly speaking, correlation is not a research method but a way of analysing data gathered by other means. This might be useful, for example, if we wanted to know whether there is an association between watching violence on T.V. (Variable A) and a tendency towards violent behaviour in adolescence (Variable B = number of incidents of violent behaviour observed by teachers).

Another area where correlation is widely used is in the study of intelligence where research has been carried out to test the strength of the association between the I.Q. levels of identical and non-identical twins.

## Correlation Coefficients

Instead of drawing a scattergram, a correlation can be expressed numerically as a coefficient, ranging from -1 to +1.

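The correlation coefficient (*r*) indicates the extent to which the pairs of numbers for these two variables lie on a straight line. Values over zero indicate a positive correlation, while values under zero indicate a negative correlation. The closer the value is to -1 or +1, the more closely the points fall on a straight line; a value of zero indicates no linear relationship.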
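As a minimal sketch, assuming *r* here refers to Pearson's product-moment coefficient (the article does not name a specific formula), the coefficient can be computed from paired scores as follows. The function name `pearson_r` and all three data sets are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of paired scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviation of every score from its own variable's mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    return co_deviation / sqrt(ss_x * ss_y)


# Made-up illustrative data (assumptions, not real measurements).
heights_cm = [150, 160, 165, 170, 180, 190]
weights_kg = [52, 58, 63, 68, 77, 85]
altitude_m = [0, 500, 1000, 1500, 2000]
temperature_c = [15, 12, 8, 5, 2]
cups_of_tea = [1, 2, 3, 4, 5]
iq_scores = [102, 98, 106, 98, 102]

print(pearson_r(heights_cm, weights_kg))       # above zero: positive correlation
print(pearson_r(altitude_m, temperature_c))    # below zero: negative correlation
print(pearson_r(cups_of_tea, iq_scores))       # about zero: no linear relationship
```

Because the formula treats both variables alike, `pearson_r(xs, ys)` and `pearson_r(ys, xs)` give the same value, which matches the point that it does not matter which variable goes on which axis.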

## Differences between Experiments & Correlations

An experiment isolates and manipulates the independent variable to observe its effect on the dependent variable, and controls the environment in order that extraneous variables may be eliminated. Experiments establish **cause and effect**.

A correlation identifies variables and looks for a **relationship** between them.

An experiment tests the effect that an independent variable has upon a dependent variable, but a correlation looks for a relationship between two variables. This means that an experiment can predict cause and effect (causation), but a correlation can only predict a relationship, as another extraneous variable may be involved that is not known about.

## Strengths of Correlations

**1**. Correlation allows the researcher to investigate naturally occurring variables that may be unethical or impractical to test experimentally. For example, it would be unethical to conduct an experiment on whether smoking causes lung cancer.

**2**. Correlation allows the researcher to clearly and easily see if there is a relationship between variables. This can then be displayed in a graphical form.

## Limitations of Correlations

**1**. Correlation is not and cannot be taken to imply causation. Even if there is a very strong association between two variables we cannot assume that one causes the other.

For example, suppose we found a positive correlation between watching violence on T.V. and violent behaviour in adolescence. It could be that the cause of both of these is a third (extraneous) variable, say, for example, growing up in a violent home, and that both the watching of T.V. and the violent behaviour are the outcome of this.
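The third-variable problem can be illustrated with a small simulation. This is a sketch under stated assumptions, not real data: every score is randomly generated, the "violent home" variable is built to drive both of the others, and T.V. viewing never influences behaviour in the code. The partial correlation step is an extra technique not mentioned in the article; it estimates what remains of the association once the third variable is held constant.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 2000

# Simulated scores (assumption): the home environment feeds into both
# variables, but neither of them feeds into the other.
violent_home = [rng.gauss(0, 1) for _ in range(n)]
tv_violence = [h + rng.gauss(0, 1) for h in violent_home]
violent_behaviour = [h + rng.gauss(0, 1) for h in violent_home]

r_tv_behaviour = pearson_r(tv_violence, violent_behaviour)
r_tv_home = pearson_r(tv_violence, violent_home)
r_behaviour_home = pearson_r(violent_behaviour, violent_home)
r_controlled = partial_r(r_tv_behaviour, r_tv_home, r_behaviour_home)

print("T.V. vs behaviour:", round(r_tv_behaviour, 2))
print("T.V. vs behaviour, home held constant:", round(r_controlled, 2))
```

The first figure comes out clearly positive even though no causal link between T.V. and behaviour exists in the simulation, while the second stays close to zero.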

**2**. Correlation does not allow us to go beyond the data that is given. For example, suppose it was found that there was an association between time spent on homework (1/2 hour to 3 hours) and number of G.C.S.E. passes (1 to 6). It would not be legitimate to infer from this that spending 6 hours on homework would be likely to generate 12 G.C.S.E. passes.
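The homework example can be made concrete with a short sketch. The scores below are made up to match the ranges in the text, and fitting a least-squares straight line is an assumption for illustration; the article does not say how a prediction would be made. The hypothetical `predict_passes` helper refuses to answer outside the range of homework times that were actually observed.

```python
# Made-up scores (assumption) covering the ranges given in the text.
homework_hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
gcse_passes = [1, 2, 3, 4, 5, 6]


def fit_line(xs, ys):
    """Least-squares slope and intercept for paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(homework_hours, gcse_passes)


def predict_passes(hours):
    """Predict passes, but only inside the observed range of homework time."""
    low, high = min(homework_hours), max(homework_hours)
    if not low <= hours <= high:
        raise ValueError(f"{hours} hours is outside the observed range {low} to {high}")
    return intercept + slope * hours


print(predict_passes(2.0))    # inside the data: a defensible estimate
print(intercept + slope * 6)  # the illegitimate extrapolation: 12 passes
```

The fitted line will happily return 12 passes for 6 hours, but nothing in the data says the relationship continues in the same way beyond 3 hours, which is why the guarded function raises an error instead.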

**How to cite this article:**

McLeod, S. A. (2008). Correlation. Retrieved from www.simplypsychology.org/correlation.html