Fitting linear trends

The most important reason for calculating a linear trend is to remove it. A major problem in using a finite sequence of data to represent a source that is effectively infinitely long is that chopping off the ends distorts it. One consequence is the Window Problem in the Fourier Transform. The other is that, even when there is no linear trend in the original process, there is always one in the finite block of data taken to represent it. Because the discrete transform treats the block as cyclic, this trend is in effect a spurious sawtooth waveform added to the original one, and it contributes spurious spectral components.

A secondary reason for calculating the linear trend is to support a theory that requires it. Since a finite sample of data inevitably contains a spurious linear component, this use can be more tendentious.

Here is a line fitted to five points in the traditional way. Of all possible straight lines, it is the one that lies closest to all the points. The sense in which it is closest is known as least squares, because the line is calculated to minimise the sum of the squares of the vertical distances from the line to the points. The line tells us something about the five points, but it tells us nothing about the source from which they came. In fact they came from a mathematical model that has no linear trend, a sine wave.
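The fit described above can be sketched in a few lines of Python. This is only an illustration: the five sample positions and the eleven-sample period are assumptions chosen for the example, not the values behind the original figure, and `numpy.polyfit` is used as one convenient least squares routine.

```python
import numpy as np

# Assumed setup: five consecutive samples of a unit sine wave
# whose period is 11 samples (not the original figure's data).
period = 11.0
x = np.arange(5, dtype=float)
y = np.sin(2.0 * np.pi * x / period)

# Degree-1 polyfit returns the slope and intercept that minimise
# the sum of squared vertical distances from the line to the points.
slope, intercept = np.polyfit(x, y, 1)

residuals = y - (slope * x + intercept)
sum_sq = np.sum(residuals ** 2)

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"sum of squared residuals = {sum_sq:.4f}")

# Any other straight line gives a larger sum of squares.
nudged = y - ((slope + 0.05) * x + intercept)
print(f"with slope nudged by 0.05: {np.sum(nudged ** 2):.4f}")
```

The fitted slope is clearly non-zero, although the sine wave that generated the points has no trend at all.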

Here is the complete sine wave for a number of cycles. It is an oscillatory process that goes on forever in the positive and negative directions. Its average is zero and the best straight line fitted to it is the x axis.

The two pictures illustrate the greatest hazard of trend estimation: the trend is a property of the data points we have and not of the original process from which they came. As we can never have an infinite number of readings, there is always an error introduced by using a restricted number of data points to represent a process in the real world. Naturally the error decreases rapidly as we increase the number of data points, but it is always there and, in fact, the calculated trend is never zero, even if the original process is stationary, as is our sine wave.

We have created a model that is a deterministic process with no inherent linear trend. Let us pursue this by making the simplistic assumption that annual temperatures are cyclic with no random variations. For no particular reason, we can make the period equal to that of the sunspot cycle (eleven years) and the amplitude one degree of temperature. It is easy to generate the sine wave and then fit a straight line to the first 3, 4, ..., 100 points. The figure shows the slope of the best-fit straight line plotted against the number of points used to estimate it.
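The experiment just described is easy to reproduce. The sketch below assumes one reading per year, a sine wave that starts at zero, and `numpy.polyfit` for the fit; the helper name `fitted_slope` is made up for this example, and the original plot may have been produced differently.

```python
import numpy as np

PERIOD_YEARS = 11.0   # sunspot-cycle period used in the text
AMPLITUDE = 1.0       # one degree of temperature

def fitted_slope(y):
    """Slope of the least squares line through y at x = 0, 1, 2, ..."""
    x = np.arange(len(y), dtype=float)
    return np.polyfit(x, y, 1)[0]

# One assumed reading per year for a century, starting at zero phase.
years = np.arange(100, dtype=float)
temps = AMPLITUDE * np.sin(2.0 * np.pi * years / PERIOD_YEARS)

# Fit a line to the first 3, 4, ..., 100 points.
counts = list(range(3, 101))
slopes = [fitted_slope(temps[:n]) for n in counts]

for n, s in zip(counts, slopes):
    if n in (5, 10, 20, 40, 80, 100):
        print(f"{n:3d} points: slope = {s:+.5f} degrees per year")
```

Plotting `slopes` against `counts` should give a curve of the kind the text describes: large for a handful of points, then shrinking while oscillating as more points are included.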

We can see that, even though the original process has no linear trend, the best-fit straight line shows one, though it decreases rapidly and cyclically with the number of points. Even after forty years, adjacent points differ in the second decimal place; yet trends have been published to a much higher precision.

A major problem is the end effect: the huge changes in apparent slope that can be wrought just by the choice of where to start the data selection. To illustrate this, we can repeat the above exercise, but phase-shift the sine wave so that it starts at minus one degree of temperature. The slopes of the lines are then dramatically different. The reason is that, in the calculation of the slope, the contribution of each data point is weighted according to its distance from the centre (see mathematical explanation below).

Of course, data in the real world are not deterministic, so we can go to the other extreme and examine the case where temperatures are random (normal) and uncorrelated, with an RMS magnitude of one degree. Again there is no trend in the original process, but we can repeat the exercise of taking more and more points to calculate the slope of the best-fit straight line. Naturally, in this situation the outcome is different for each particular realisation of the process, but the figure below shows a typical result. Again, even after thirty years, there are substantial variations in the third decimal place of degrees. Note that these slopes are per year. Some researchers multiply them by ten to get a per-decade figure, so the variations are multiplied pro rata. Thus even two decimal places of precision is pushing it after thirty years.
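Both variations can be tried with a short script. As before this is a sketch under assumptions: one reading per year, a quarter-cycle phase shift to make the sine wave start at minus one, and an arbitrary random seed, so the noise result is just one realisation and will not match the original figure. The helper name `slope_series` is invented for the example.

```python
import numpy as np

PERIOD_YEARS = 11.0

def slope_series(y, counts):
    """Least squares slope of the first n points of y, for each n in counts."""
    return np.array(
        [np.polyfit(np.arange(n, dtype=float), y[:n], 1)[0] for n in counts]
    )

years = np.arange(100, dtype=float)
counts = list(range(3, 101))
omega = 2.0 * np.pi / PERIOD_YEARS

# Deterministic case: the same sine wave with two starting phases.
sine_from_zero = np.sin(omega * years)                   # starts at 0
sine_from_trough = np.sin(omega * years - np.pi / 2.0)   # starts at -1

# Random case: uncorrelated normal values, RMS of one degree.
rng = np.random.default_rng(seed=0)   # arbitrary seed, one realisation
noise = rng.normal(loc=0.0, scale=1.0, size=years.size)

from_zero = slope_series(sine_from_zero, counts)
from_trough = slope_series(sine_from_trough, counts)
from_noise = slope_series(noise, counts)

for n in (5, 10, 20, 30, 40, 100):
    i = counts.index(n)
    print(f"{n:3d} points: zero start {from_zero[i]:+.4f}, "
          f"trough start {from_trough[i]:+.4f}, noise {from_noise[i]:+.4f}")
```

Changing the seed gives a different noise column each time, which is the point: none of these slopes reflects a trend in the underlying process.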

In the real world, observations are neither uncorrelated nor deterministic. There are random elements combined with oscillations of a local nature. Sequences of high and low annual temperatures tend to occur in groups. By exploiting the end-effect distortion at both ends, it is possible to create an apparently real trend out of nothing.

Annual temperatures have been used in these illustrations because they provide some of the most contentious examples of trend estimation, but the same principles apply to all cases where straight lines are fitted to data.

A fine example of the genre in another application can be found here.

Mathematical footnote:

The slope of the best-fit straight line is obtained by dividing the covariance of the two variables by the variance of the independent variable. Since any particular ordinate appears only once in the summation, it is easy to show by partial differentiation that each ordinate is weighted by its distance from the middle. The variance of the first N integers is (N² - 1)/12, which tends towards N²/12, so there is also a substantial weighting according to the number of data.
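The footnote can be checked numerically. The sketch below assumes x takes the values 1 to N and uses arbitrary random ordinates; the helper name `slope_weights` is made up for the example. It computes the slope three ways (as a weighted sum of the ordinates, as covariance over variance, and with `numpy.polyfit`) and compares the variance of the first N integers with N²/12.

```python
import numpy as np

def slope_weights(n):
    """Weight of each ordinate in the slope for x = 1..n.

    The slope is sum(w * y) with w = (x - mean(x)) / sum((x - mean(x))**2),
    so each weight is proportional to the point's distance from the middle.
    """
    x = np.arange(1, n + 1, dtype=float)
    d = x - x.mean()
    return d / np.sum(d ** 2)

n = 30
x = np.arange(1, n + 1, dtype=float)
rng = np.random.default_rng(seed=1)   # arbitrary illustrative ordinates
y = rng.normal(size=n)

w = slope_weights(n)
slope_from_weights = np.dot(w, y)
# Population covariance and variance (both divide by n).
slope_from_cov = np.cov(x, y, bias=True)[0, 1] / np.var(x)
slope_from_polyfit = np.polyfit(x, y, 1)[0]

print(slope_from_weights, slope_from_cov, slope_from_polyfit)
print("end weights:", w[0], w[-1], " middle weights:", w[n // 2 - 1], w[n // 2])
print("variance of 1..N:", np.var(x), " N^2/12:", n ** 2 / 12.0)
```

The end points carry the largest weights, with opposite signs, while points near the middle contribute almost nothing, which is why the choice of start and end dates matters so much.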