# Least Squares Polynomial Fitting to N-Data Points - 04 Aug 2018 at 21:26

## Least Squares Polynomial Fitting to N-Data Points

The least squares method can be used to fit a polynomial to a set of data points. Usually, this curve fitting will reveal something interesting about the underlying data, including yielding the possibility of predicting other values that have not been previously measured. In this article, we will fit a second-order polynomial to a set of data points, consider the maths, and then look at example code in C#.

First, the Maths

The following data points will be used in this example:

x = { -3, -2, -1, -0.2, 1, 3 }

y = { 0.9, 0.8, 0.4, 0.2, 0.1, 0 }

The first thing to do is graph these data points to get an idea of what we are working with:

The coefficients we wish to derive comprise those used in a quadratic (which is a second-order polynomial):

f(x) = ax^2 + bx + c

a ≠ 0

We wish to find a, b, and c coefficients using the measured data points. This will yield a quadratic which can then be used to fit a curve through the data points.

To find these coefficients, it is necessary to construct an equation that will solve the series of linear equations that will in turn reveal the coefficients. This equation takes the following generalised form:

Ma = b

 [ [N,sum_(i=1)^N x_i,cdots,sum_(i=1)^N x_i^k], [sum_(i=1)^N x_i,sum_(i=1)^N x_i^2,cdots,sum_(i=1)^N x_i^(k+1)], [vdots,vdots,ddots,vdots], [sum_(i=1)^N x_i^k,sum_(i=1)^N x_i^(k+1),cdots,sum_(i=1)^N x_i^(2k)] ] [ [a_0], [a_1], [vdots], [a_k] ] = [ [sum_(i=1)^N y_i], [sum_(i=1)^N x_i y_i], [vdots], [sum_(i=1)^N x_i^k y_i] ]

Putting the equation in the form required to solve the system of linear equations for a second-order polynomial:

 [ [N,sum_(i=1)^N x_i,sum_(i=1)^N x_i^k], [sum_(i=1)^N x_i,sum_(i=1)^N x_i^2,sum_(i=1)^N x_i^(k+1)], [sum_(i=1)^N x_i^k,sum_(i=1)^N x_i^(k+1),sum_(i=1)^N x_i^(2k)] ] [ [a_0], [a_1], [a_2] ] = [ [sum_(i=1)^N y_i], [sum_(i=1)^N x_i y_i], [sum_(i=1)^N x_i^k y_i] ]

Note that these matrices only have three rows as we are attempting to solve a series of linear equations for a quadratic (second-order) polynomial, and a quadratic polynomial has three coefficients.

Evaluating each element of the matrix using the data points:

sum_(i=1)^N x_i = (-3 -2 -1 -0.2 +1 + 3) = -2.2

sum_(i=1)^N x_i^k = sum_(i=1)^N x_i^2 = ((-3)^2 + (-2)^2 + (-1)^2 +(-0.2)^2 + (1)^2 + (3)^2) = 24.04

sum_(i=1)^N x_i^2 = ((-3)^2 + (-2)^2 + (-1)^2 +(-0.2)^2 + (1)^2 + (3)^2) = 24.04

sum_(i=1)^N x_i^(k+1) = sum_(i=1)^N x_i^3 = ((-3)^3 + (-2)^3 + (-1)^3 +(-0.2)^3 + (1)^3 + (3)^3) = -8.008

sum_(i=1)^N x_i^(2k) = sum_(i=1)^N x_i^4 = ((-3)^4 + (-2)^4 + (-1)^4 +(-0.2)^4 + (1)^4 + (3)^4) = 180.002

sum_(i=1)^N y_i = (0.9 + 0.8 + 0.4 + 0.2 + 0.1 + 0) = 2.4

sum_(i=1)^N x_i y_i = ((-3 * 0.9) + (-2 * 0.8) + (-1 * 0.4) + (-0.2 * 0.2) + (1 * 0.1) + (3 * 0)) = -4.64

sum_(i=1)^N x_i^k y_i = sum_(i=1)^N x_i^2 y_i = (((-3)^2 * 0.9) + ((-2)^2 * 0.8) + ((-1)^2 * 0.4) + ((-0.2)^2 * 0.2) + ((1)^2 * 0.1) + ((3)^2 * 0)) = 11.808

Here is this equation evaluated against the data points:

 [ [6,-2.2,24.04], [-2.2,24.04,-8.008], [24.04,-8.008,180.002] ] [ [a_0], [a_1], [a_2] ] = [ [2.4], [-4.64], [11.808] ]

Next, pull out the matrix from the previous equation. This will be used as a template to construct the next matrices needed:

 M = [ [6,-2.2,24.04], [-2.2,24.04,-8.008], [24.04,-8.008,180.002] ]

Three more matrices are needed (one for each coefficient) and these are constructed from the template matrix by substituting the corresponding column with the column vector b.

Matrix 0:

 M_0 = [ [2.4,-2.2,24.04], [-4.64,24.04,-8.008], [11.808,-8.008,180.002] ]

Note that the first column has been replaced by the column vector b.

Matrix 1:

 M_1 = [ [6,2.4,24.04], [-2.2,-4.64,-8.008], [24.04,11.808,180.002] ]

Note that the second column has been replaced by the column vector b.

Matrix 2:

 M_2 = [ [6,-2.2,2.4], [-2.2,24.04,-4.64], [24.04,-8.008,11.808] ]

Note that the third column has been replaced by the column vector b.

The next step is to find the determinant of each matrix. (You can read how to do this for a 3 x 3 matrix in another article.)

d = 1166.1

d_0 = 2671.2

d_1 = -1898.46

d_2 = 323.763

The final step is to calculate the coefficients using the following equations:

a_0 = d_0 / d

 => 2671.2 / 1166.1

 => 0.229066

a_1 = d_1 / d

 => -1898.46 / 166.1

 => -0.1628

a_2 = d_2 / d

 => 323.763 / 1166.1

 => 0.027764

These values comprise the elements of the column vector a, and as such also are the corresponding coefficients of the quadratic:

a = a_2

b = a_1

c = a_0

ax^2 + bx + c

 => 0.027764x + -0.1628x + 0.229066

The quadratic can be used with the original x-values to generate new y-values. Graphing the new series, we can compare how well the curve fits:

x = { -3, -2, -1, -0.2, 1, 3 }

y_1 = { 0.9, 0.8, 0.4, 0.2, 0.1, 0 }

y_2 = { 0.967342, 0.665722, 0.41963, 0.26273656, 0.09403, -0.009458 }

And as you can see, it's a pretty decent approximation.

The Code

The maths was the tricky part, now comes the easy part: implementing the maths in code.

At time of writing, the method that produces a quadratic from a series of data points that I am using in my post-grad studies looks like this:


{

int k = 2;   //  Size of order

//  Matrix size must be k + 1
int size = (k + 1);

List<Coordinate2D<decimal>> decimalCoordinate2Ds = new List<Coordinate2D<decimal>>();

//  N               SumOf(x_i)          ...     SumOf(x^k_i)
//  SumOf(x_i)      SumOf(x^2_i)        ...     SumOf(x^(k + 1)_i)
//     :                :                :        :
//  SumOf(x^k_i)    SumOf(x^(k+1)_i)    ...     SumOf(x^(2k)_i)

Matrix<decimal> matrix = new Matrix<decimal>(size);
matrix[0, 0] = decimalCoordinate2Ds.Count;
matrix[0, 1] = decimalCoordinate2Ds.Sum(m => (m.X));
matrix[0, 2] = decimalCoordinate2Ds.Sum(m => (m.X.ToExponent(k)));

matrix[1, 0] = decimalCoordinate2Ds.Sum(m => (m.X));
matrix[1, 1] = decimalCoordinate2Ds.Sum(m => (m.X.ToExponent(2)));
matrix[1, 2] = decimalCoordinate2Ds.Sum(m => (m.X.ToExponent(k + 1)));

matrix[2, 0] = decimalCoordinate2Ds.Sum(m => (m.X.ToExponent(k)));
matrix[2, 1] = decimalCoordinate2Ds.Sum(m => (m.X.ToExponent(k + 1)));
matrix[2, 2] = decimalCoordinate2Ds.Sum(m => (m.X.ToExponent(2 * k)));

//  SumOf(y_i)
//  SumOf(x_i * y_i)
//     :
//  SumOf(x^k_i * y_i)

decimal[] columnVector = new decimal[]
{
decimalCoordinate2Ds.Sum(m => (m.Y)),
decimalCoordinate2Ds.Sum(m => (m.X * m.Y)),
decimalCoordinate2Ds.Sum(m => (m.X.ToExponent(k) * m.Y))
};

//  SumOf(y_i)          SumOf(x_i)          ...     SumOf(x^k_i)
//  SumOf(x_i * y_i)    SumOf(x^2_i)        ...     SumOf(x^(k + 1)_i)
//     :                    :                :        :
//  SumOf(x^k_i * y_i)  SumOf(x^(k+1)_i)    ...     SumOf(x^(2k)_i)

Matrix<decimal> matrix0 = matrix.FromMatrix();
matrix0.ReplaceColumn(0, columnVector);

//  SumOf(y_i)          SumOf(x_i)          ...     SumOf(x^k_i)
//  SumOf(x_i * y_i)    SumOf(x^2_i)        ...     SumOf(x^(k + 1)_i)
//     :                    :                :        :
//  SumOf(x^k_i * y_i)  SumOf(x^(k+1)_i)    ...     SumOf(x^(2k)_i)

Matrix<decimal> matrix1 = matrix.FromMatrix();
matrix1.ReplaceColumn(1, columnVector);

//  SumOf(y_i)          SumOf(x_i)          ...     SumOf(x^k_i)
//  SumOf(x_i * y_i)    SumOf(x^2_i)        ...     SumOf(x^(k + 1)_i)
//     :                    :                :        :
//  SumOf(x^k_i * y_i)  SumOf(x^(k+1)_i)    ...     SumOf(x^(2k)_i)

Matrix<decimal> matrix2 = matrix.FromMatrix();
matrix2.ReplaceColumn(2, columnVector);

decimal matrixDeterminant = matrix.Determinant();
decimal matrix0Determinant = matrix0.Determinant();
decimal matrix1Determinant = matrix1.Determinant();
decimal matrix2Determinant = matrix2.Determinant();

if (matrixDeterminant != 0)
{
//  y = ax^2 + b2 + c

decimal a = (matrix2Determinant / matrixDeterminant);
decimal b = (matrix1Determinant / matrixDeterminant);
decimal c = (matrix0Determinant / matrixDeterminant);

}