Python – Linear Regression

In 2013 and 2014 (wow, already 7 years ago!) I wrote two articles about linear regression with Excel. Now that I am getting more and more interested in Python, I thought it would be interesting to remake the article as a Python one. So, this is our input – the daily profit per week:

So, starting and loading the data looks like this:
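Something along these lines, assuming NumPy and a small hand-typed dataset – the profit figures below are illustrative placeholders, not the article's original numbers:

```python
import numpy as np

# Illustrative data – week number vs. daily profit for that week.
# (These numbers are placeholders, not the article's original figures.)
x = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)
y = np.array([5, 9, 12, 18, 19, 25, 28, 33])
```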

The .reshape(-1, 1) is required so that x becomes a list of lists (a two-dimensional column), which is the shape the model expects:
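To see what .reshape(-1, 1) actually does, here is a tiny illustration:

```python
import numpy as np

a = np.array([1, 2, 3])
print(a.shape)                    # one-dimensional: (3,)
print(a.reshape(-1, 1).shape)     # two-dimensional column: (3, 1)
print(a.reshape(-1, 1).tolist())  # a list of lists: [[1], [2], [3]]
```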

Now, on to the model – the following 2 lines do the magic:
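Those two lines would look like this, assuming scikit-learn's LinearRegression (which matches the r_sq/intercept/slope terminology used further down) and the illustrative data from above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)  # illustrative data
y = np.array([5, 9, 12, 18, 19, 25, 28, 33])

model = LinearRegression()  # line 1: create the model
model.fit(x, y)             # line 2: fit it to the data
```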

The model is “fitted”. This means that a line is produced which “fits” the dots in such a way that the sum of the squared residuals is minimal (equivalently, r_sq is as high as it can get). This is how to produce the fitted line and the scattered points:
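A sketch of that plot, assuming matplotlib and the illustrative data from above:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; use plt.show() interactively
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

x = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)  # illustrative data
y = np.array([5, 9, 12, 18, 19, 25, 28, 33])
model = LinearRegression().fit(x, y)

plt.scatter(x, y, color="blue")             # the scattered points
plt.plot(x, model.predict(x), color="red")  # the fitted line
plt.savefig("fitted_line.png")
```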

The more interesting part of linear regression is the “prediction”. It is like asking – “If we only had that tiny red line from the plot above, where would we put our values for a given period?” And the answer is actually quite simple – “On that red line!”. This is how to do it. First, generate the predicted values:
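Generating them is a one-liner on the fitted model – again a sketch with the illustrative data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)  # illustrative data
y = np.array([5, 9, 12, 18, 19, 25, 28, 33])
model = LinearRegression().fit(x, y)

y_pred = model.predict(x)  # every predicted value lies on the red line
```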

They look like this:
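With the illustrative data, printing them is enough to take a look (the original article's output listing is not reproduced here):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)  # illustrative data
y = np.array([5, 9, 12, 18, 19, 25, 28, 33])
y_pred = LinearRegression().fit(x, y).predict(x)

print(y_pred.round(2))  # eight values, all lying on the fitted line
```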

And they are quite different from the original values. How different? See for yourself:
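A sketch of such a comparison plot, using fig.add_subplot() and the illustrative data:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

x = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)  # illustrative data
y = np.array([5, 9, 12, 18, 19, 25, 28, 33])
y_pred = LinearRegression().fit(x, y).predict(x)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(x, y, color="blue", label="original")      # the real values
ax.scatter(x, y_pred, color="red", label="predicted") # values on the line
ax.legend()
fig.savefig("comparison.png")
```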

Alternatively, we may use fewer lines to produce the same result, without the add_subplot() part from the code above. But I guess it is less fun:
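Using the pyplot interface directly, the same comparison fits in fewer lines (same illustrative data as above):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

x = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)  # illustrative data
y = np.array([5, 9, 12, 18, 19, 25, 28, 33])
y_pred = LinearRegression().fit(x, y).predict(x)

plt.scatter(x, y, color="blue", label="original")
plt.scatter(x, y_pred, color="red", label="predicted")
plt.legend()
plt.savefig("comparison_short.png")
```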

And if we want to finish with something that makes our article really a bit more statistical, here are the linear regression features:

  • coefficient of determination (or r^2)
  • intercept – this is the a in the formula Y = a + bX
  • slope – this is the b: how much y changes for every unit of x. If the slope is 7, then for x = [1, 2, 3] we get y = [7, 14, 21], provided the intercept is 0.
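With scikit-learn, all three come straight from the fitted model (sketched with the illustrative data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)  # illustrative data
y = np.array([5, 9, 12, 18, 19, 25, 28, 33])
model = LinearRegression().fit(x, y)

r_sq = model.score(x, y)      # coefficient of determination, r^2
intercept = model.intercept_  # a in Y = a + bX
slope = model.coef_[0]        # b in Y = a + bX

print(r_sq, intercept, slope)
```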

The code is available here. Enjoy!
