So, what exactly is standard deviation? In simple terms, it’s a measure of how far the numbers in a dataset are spread out from the mean (the average). If the numbers are close to the mean, the standard deviation is small. If they are spread out, the standard deviation is large.
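To make the "small versus large" idea concrete, here is a minimal sketch using two made-up datasets (the names `tight` and `spread` and their values are illustrative assumptions, not data from this post). Both have the same mean, but very different spreads:

```python
import statistics

# Two hypothetical datasets with the same mean (10) but different spread
tight = [9, 10, 10, 11]
spread = [1, 5, 15, 19]

# pstdev treats the data as the whole population (divides by n)
tight_sd = statistics.pstdev(tight)    # roughly 0.71
spread_sd = statistics.pstdev(spread)  # roughly 7.28

print(statistics.mean(tight), tight_sd)
print(statistics.mean(spread), spread_sd)
```

The mean alone cannot tell these two datasets apart; the standard deviation can.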
Here is how to calculate it:
```python
import math
import statistics

import numpy as np

# Generate the data
data = [4, 8, 6, 5, 3]

# Get the mean of the data
mean = statistics.mean(data)

# For each number, subtract the mean and square the result
# (a.k.a. producing squared differences)
squared_differences = []
for number in data:
    squared_difference = (number - mean) ** 2
    squared_differences.append(squared_difference)

# Find the average of these squared differences (the population variance)
avg_sq_diff = statistics.mean(squared_differences)

# Take the square root of the average to get the standard deviation
std_dev = math.sqrt(avg_sq_diff)
print(std_dev)

# Or use np.std(data), which computes the same population standard deviation
print(np.std(data))
```
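One detail worth knowing: the calculation above divides by the number of values, which gives the population standard deviation. That is also what `np.std` does by default (`ddof=0`). If the data is only a sample from a larger population, the usual convention is to divide by n - 1 instead, which is `np.std(data, ddof=1)` in NumPy. The sketch below shows the difference using only the standard library; the variable names are my own:

```python
import statistics

data = [4, 8, 6, 5, 3]

# Population standard deviation: divide the squared differences by n
population_sd = statistics.pstdev(data)  # roughly 1.72

# Sample standard deviation: divide by n - 1 instead
sample_sd = statistics.stdev(data)       # roughly 1.92

print(population_sd, sample_sd)
```

The sample version is always a bit larger, and the gap shrinks as the dataset grows.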
And of course, here is the all-important bell curve:
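The bell curve (the normal distribution) is where standard deviation is most often put to work: a fixed share of the values falls within 1, 2, and 3 standard deviations of the mean. As a small sketch, assuming a standard normal distribution and Python 3.8+ for `statistics.NormalDist`, those shares can be computed like this:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
bell = NormalDist(mu=0, sigma=1)

# Share of values lying within k standard deviations of the mean
within = {k: bell.cdf(k) - bell.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} standard deviation(s): {share:.2%}")
```

This gives roughly 68%, 95%, and 99.7%, which is why the pattern is often called the 68-95-99.7 rule.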
The GitHub repository with the code used in the video below is here.