Logistic regression (Andrew Ng course)
I've been trying to finish Andrew Ng's Machine Learning course, and I am now at the part about logistic regression. I am trying to find the parameters and compute the cost without using the MATLAB function fminunc. However, I am not converging to the correct results posted by other students who finished the assignment using fminunc. Specifically, my problems are:

- the parameters theta are incorrect;
- my cost seems to be blowing up;
- I get many NaNs in my cost vector (I create a vector of the costs to keep track of them).

I attempted to find the parameters via gradient descent as I understood it from the course, but my implementation still seems to give incorrect results. The parameters theta that I get are: The results below are from one example that I downloaded, but its author used fminunc for this one.

I ran your code and it does work fine.
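For readers without the original code: the setup being discussed, a logistic-regression cost function plus plain gradient descent that records the cost at every iteration, can be sketched in Python/NumPy rather than MATLAB. This is a hedged translation of the general approach, not the asker's actual code; all function and variable names are illustrative.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps raw scores into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # Cross-entropy cost; clipping keeps log() finite, since
    # log(0) is exactly what produces NaN/Inf costs
    h = np.clip(sigmoid(X @ theta), 1e-15, 1 - 1e-15)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / len(y)

def gradient_descent(X, y, alpha=0.003, iters=200000):
    # Plain batch gradient descent on the logistic-regression cost,
    # keeping a vector of costs to track convergence (as in the question)
    m, n = X.shape
    theta = np.zeros(n)
    costs = []
    for _ in range(iters):
        h = sigmoid(X @ theta)
        theta -= alpha * (X.T @ (h - y)) / m
        costs.append(cost(theta, X, y))
    return theta, costs
```

Here `X` is assumed to already include a column of ones for the intercept term, as in the course's assignments.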

However, the tricky thing about gradient descent is ensuring that your costs don't diverge to infinity. If you look at your costs array, you will see that the costs definitely diverge, and this is why you are not getting the correct results. The best way to eliminate this in your case is to reduce the learning rate. Through experimentation, I found that a learning rate of alpha = 0.003 works best for your problem. I also increased the number of iterations to 200000. Changing these two things gives me the following parameters and associated cost: This is more or less in line with the magnitudes of the parameters you see when using fminunc. However, fminunc arrives at slightly different parameters and a slightly different cost because of the minimization method itself: fminunc uses a variant of L-BFGS, which finds the solution much faster. What matters most is the actual accuracy.
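The effect of the learning rate can be demonstrated with a small sketch (again Python/NumPy rather than MATLAB, with made-up data on a raw, unnormalized score scale; the specific alpha values are illustrative, not the ones tuned for the asker's dataset).

```python
import numpy as np

def run_gd(X, y, alpha, iters):
    # Batch gradient descent on the logistic-regression cost,
    # returning the cost recorded at every iteration
    m, n = X.shape
    theta = np.zeros(n)
    costs = []
    for _ in range(iters):
        z = np.clip(X @ theta, -30, 30)   # clip scores: avoids exp overflow and log(0)
        h = 1.0 / (1.0 + np.exp(-z))
        costs.append(-(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m)
        theta -= alpha * (X.T @ (h - y)) / m
    return costs

# Synthetic, unnormalized features (illustrative, not the course's data)
rng = np.random.default_rng(1)
scores = rng.uniform(20, 100, size=(200, 2))
X = np.column_stack([np.ones(200), scores])
y = (scores.sum(axis=1) > 120).astype(float)

diverging = run_gd(X, y, alpha=0.1, iters=200)      # step too large: cost blows up
converging = run_gd(X, y, alpha=0.0001, iters=200)  # step small enough: cost shrinks
```

With unnormalized features on this scale, the larger learning rate overshoots on every step and the cost climbs, while the smaller one steadily decreases it. This is the same trade-off the answer resolves by dropping alpha to 0.003 and compensating with more iterations.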

Remember that to classify an example as label 0 or 1, you take the weighted sum of the parameters and features, run it through the sigmoid function, and threshold at 0.5. We then compute how often, on average, the predicted label matches the expected label. Using the parameters found with gradient descent gives us the following accuracy: This means that we've achieved an 89% classification accuracy. Using the parameters provided by fminunc also gives: So the accuracy is the same, and I wouldn't worry too much about the magnitude of the parameters; the comparison is more meaningful when you look at the costs of the two implementations. As a final note, I would suggest looking at this post of mine for some tips on how to make logistic regression work in the long run. I would definitely recommend normalizing your features before finding the parameters to make the algorithm run faster. It also addresses why you were finding the wrong parameters (namely, the cost blowing up): Cost function in logistic regression gives NaN as a result.
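The classification rule just described (weighted sum, sigmoid, threshold at 0.5, then mean agreement with the true labels) can be sketched as follows; this is a Python/NumPy illustration of that rule, not the course's MATLAB code, and the names are made up.

```python
import numpy as np

def predict(theta, X):
    # Weighted sum of parameters and features, squashed through the
    # sigmoid, then thresholded at 0.5 to produce a 0/1 label
    return (1.0 / (1.0 + np.exp(-(X @ theta))) >= 0.5).astype(int)

def accuracy(theta, X, y):
    # Average number of times the predicted and expected labels match
    return np.mean(predict(theta, X) == y)
```

Because sigmoid(z) >= 0.5 exactly when z >= 0, thresholding the sigmoid at 0.5 is equivalent to thresholding the raw weighted sum at 0.

```python
theta = np.array([0.0, 2.0, 2.0])              # illustrative parameters
X = np.array([[1.0,  1.0,  1.0],               # first column is the intercept term
              [1.0, -1.0, -1.0],
              [1.0,  0.5, -0.2],
              [1.0, -0.5,  0.2]])
y = np.array([1, 0, 1, 0])
print(accuracy(theta, X, y))                   # all four examples classified correctly
```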