What Is a Loss Function
A loss function in machine learning is a mathematical function that measures the difference between a model's predicted output ($\hat{y}$) and the actual output (the true label, $y$). It quantifies how well or poorly the model performs: the smaller the loss, the closer the predictions are to the actual values.
Types of Loss Functions
Different machine learning problems call for different loss functions. This section introduces Mean Squared Error (MSE) as an example. MSE measures the average squared difference between the predicted and actual values and is commonly used in regression problems.
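As a rough illustration of "average squared difference", here is a minimal Python sketch. The function name `mean_squared_error` and the sample numbers are assumptions chosen for illustration only; they do not come from the house price example.

```python
def mean_squared_error(y_pred, y_true):
    """Average of the squared differences between predictions and targets."""
    if len(y_pred) != len(y_true):
        raise ValueError("y_pred and y_true must have the same length")
    n = len(y_true)
    # Sum the squared errors, then divide by the number of samples
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n


# Hypothetical predictions and labels, for illustration only
print(mean_squared_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0]))
```

A perfect prediction gives an MSE of 0, and the value grows quickly as predictions move away from the labels because each error is squared.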
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2$$
Example
This example revisits the earlier house price prediction case in more detail. The model and cost function are given in the table below. To keep things simple, b is set to 0, so the function $f_{w,b}(x) = wx + b$ reduces to $f_w(x) = wx$.
Assume there are three known data points, written as (x, y): (1,1), (2,2), and (3,3). The input value is on the x-axis and the output value is on the y-axis. Then set w to 0, 0.5, and 1 in turn and compute the cost for each (w can take any value; these three are just samples).
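The calculation in detail, with $n = 3$:
$$J(w) = \frac{1}{2n}\sum_{i=1}^{n}(w x_i - y_i)^2$$
$$J(0) = \frac{1}{6}\sum_{i=1}^{3}(0 \cdot x_i - y_i)^2 = \frac{1^2 + 2^2 + 3^2}{6} = \frac{14}{6} \approx 2.33$$
$$J(0.5) = \frac{1}{6}\sum_{i=1}^{3}(0.5\,x_i - y_i)^2 = \frac{(0.5-1)^2 + (1-2)^2 + (1.5-3)^2}{6} = \frac{3.5}{6} \approx 0.58$$
$$J(1) = \frac{1}{6}\sum_{i=1}^{3}(x_i - y_i)^2 = \frac{(1-1)^2 + (2-2)^2 + (3-3)^2}{6} = 0$$
[Figure: visualization of the calculation results]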
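The three cost values can be checked with a short Python sketch. The helper name `cost` is an assumption made for this illustration; the data points and the 1/(2n) scaling are the ones used in the example.

```python
# Data points (x, y) from the example
data = [(1, 1), (2, 2), (3, 3)]


def cost(w, points):
    """J(w) = 1/(2n) * sum((w*x - y)^2), with b fixed at 0."""
    n = len(points)
    return sum((w * x - y) ** 2 for x, y in points) / (2 * n)


for w in (0, 0.5, 1):
    print(f"J({w}) = {cost(w, data):.2f}")
# J(0) = 2.33
# J(0.5) = 0.58
# J(1) = 0.00
```

The cost drops as w moves from 0 toward 1 and reaches 0 at w = 1, where the line passes through all three points.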
Are there other ways to calculate w, instead of trying values one by one? (Speculation)
Definition of the Loss Function
Assuming b=0, the loss function J(w) is defined as:
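$$J(w) = \frac{1}{n}\sum_{i=1}^{n}(w x_i - y_i)^2$$
where n=3, because we have three data points. This form drops the factor of 1/2 used in the cost function of the example above; a constant factor rescales J(w) but does not change the value of w that minimizes it.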
Substituting Data Points
Substitute the data points (1,1), (2,2), and (3,3) into the loss function:
$$J(w) = \frac{1}{3}\left[(w \cdot 1 - 1)^2 + (2w - 2)^2 + (3w - 3)^2\right]$$
This can be further simplified to:
$$J(w) = \frac{1}{3}\left[(w - 1)^2 + (2w - 2)^2 + (3w - 3)^2\right]$$
Taking the Derivative with Respect to w
Differentiate with respect to w and set the derivative to zero:
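$$\frac{\partial J(w)}{\partial w} = \frac{2}{3}\left[(w - 1)\cdot 1 + (2w - 2)\cdot 2 + (3w - 3)\cdot 3\right] = 0$$
Expand and simplify:
$$\frac{\partial J(w)}{\partial w} = \frac{2}{3}\left[(w - 1) + 4(w - 1) + 9(w - 1)\right] = \frac{2}{3}\times 14\times (w - 1) = 0$$
Solving gives: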
w=1
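The same result can be checked numerically. The sketch below assumes the 1/n form of J(w) and b = 0. The helper names `dJ_dw` and `best_w` are hypothetical, and the general formula w = Σ(x·y) / Σ(x²) is not stated in the text; it follows from setting the derivative to zero for arbitrary data points.

```python
# Data points (x, y) from the example
data = [(1, 1), (2, 2), (3, 3)]


def dJ_dw(w, points):
    """Derivative of J(w) = (1/n) * sum((w*x - y)^2) with respect to w."""
    n = len(points)
    return 2 / n * sum((w * x - y) * x for x, y in points)


def best_w(points):
    """Solve dJ/dw = 0 for w: w = sum(x*y) / sum(x^2)."""
    return sum(x * y for x, y in points) / sum(x * x for x, y in points)


w_star = best_w(data)
print(w_star)               # 1.0
print(dJ_dw(w_star, data))  # 0.0
```

Because J(w) is a quadratic in w with a positive leading coefficient, the single point where the derivative vanishes is its minimum, so no trial-and-error search over w is needed here.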
Conclusion
The optimal value of w is 1, so the ideal regression model (function) is:
f(x)=1⋅x