Using SVD to Obtain Regression Lines

Singular Value Decomposition (SVD) is a factorization of a real or complex matrix, which can then be used in various methods. If matrix M is a real m x n matrix, where m > n we can obtain the following from factorization:

In the formula above,

• U represents the unitary matrix, which defined as the orthogonal matrix in a real matrix, according to wolfram:

Each row has length one, and their Hermitian inner product is zero.

• The Σ represents diagonal matrix, which are simply the variables on the main diagonal of the original matrix M.
• V* represents the conjugate transpose of a unitary matrix of matrix M.

It has been known that SVD can be used in regression analysis since the early 1980’s [1]. This example is intended to demonstrate how to do so in python. I previously did an example where I found a Linear Regression using a more standard method. I will be using the same data, here are the results side-by-side:

Both ways of determining a linear regression line have nearly identical results.

Programming Singular Value Decomposition

As opposed to factorizing the A matrix yourself I would highly recommend decomposing the matrix with numpy, ‘r’ represents the regression line (the first value represents the slope, the second the starting point) like so:

This is relatively straight forward and with numpy it becomes trivial to program. The ‘r’ array I obtained out of this function with the above data was:

[0.90888948  0.11567684]

or

y = (0.90888948)x + 0.11567684

If we then wish graph the linear regression line all we have to do is iterate over each x value and output a y:

If we then graph the line, along with the various points we obtain the following:

If we compare the above with the graph I previously made for the more standard method:

They are nearly identical, as expected. If you would like to test the results yourself or fiddle with the code feel free to visit my github and download/fork the code.