Tuesday, July 30, 2019

Does logistic regression require independent variables to be normally distributed?


OK. So I looked the "professor" in question up. The link you posted went to Data Science Central. Some people there are pretty good, but others... well, I can tell some stories, but I won't. She's a part-time lecturer, with no recent classes (apparently) at George Washington University. And she was a PhD candidate (ABD), but for less than a year. That means she wasn't ABD (all but dissertation), since she likely didn't do (or perhaps didn't pass?) her comprehensives. (To be a professor, by the way, you have to have a PhD or other terminal degree.)

Really, no regression model requires normality of the independent or dependent variables. ... simply the conditional distribution (on the design matrix) of the errors, contrary to my previous statement.

Edit 3 (2 was replaced): Peter Flom has corrected me once again, and actually, I misrepresented what he said the first time. Apologies to everyone, especially you, Peter, for that. Somehow I was mixing up my conditional probability. (If any of my old students from my PhD tutoring days are reading this, and you're one of the ones with whom I made the deal about me forgetting something "that should always be back in the back of your mind": since this is a public forum, yes, I owe you all one lesson's refund. You all know where I live... virtually, at least. As in, you have my email address.)

Further edit, Sept 25, 2016: I was thinking about something related, remembered my answer to this question, and thought I'd correct it. Conditional normality of the errors is not technically required for OLS regression to be valid. It is required for various finite-sample properties to hold, and it is what makes the OLS estimates coincide with the maximum likelihood estimates. This kind of thing is necessary for certain inferential results, but not, for example, for good predictions to be possible. Of course, it helps prediction when this is true, but it's not necessary. This is often poorly understood.
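To make the main point concrete for the question as asked, here is a minimal sketch (not anything from the original answer) that simulates a logistic regression whose single predictor is strongly right-skewed (exponential, so clearly not normal) and checks that the fit still recovers the coefficients. Everything in it is an assumption chosen for illustration: the function names `sigmoid`, `fit_logistic`, and `simulate`, the true coefficients, the sample size, and the use of plain Newton-Raphson rather than a library fitter.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, iters=25):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by Newton-Raphson (hypothetical helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the (positive) information matrix
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            r = y - p
            w = p * (1.0 - p)
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, with the 2x2 inverse written out.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def simulate(n=5000, true_b0=-1.0, true_b1=0.8, seed=0):
    """Draw a skewed, non-normal predictor and Bernoulli outcomes from a logistic model."""
    rng = random.Random(seed)
    xs = [rng.expovariate(1.0) for _ in range(n)]  # exponential, not normal
    ys = [1 if rng.random() < sigmoid(true_b0 + true_b1 * x) else 0 for x in xs]
    return xs, ys


xs, ys = simulate()
b0_hat, b1_hat = fit_logistic(xs, ys)
print("estimated intercept and slope:", b0_hat, b1_hat)
```

The estimates should land close to the assumed true values of -1.0 and 0.8. The model only specifies how the outcome depends on the predictor; it says nothing about how the predictor itself is distributed.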
Edit 1: I had originally said outliers had no effect on SVMs, which may not be completely true. It depends on where these points are, the kernel, etc. Whether atypical points affect the performance of an SVM really depends, then, on the entire formulation of the problem.

Continued: SVMs are more often seen as a geometric optimization problem (motivated by a probability bound based on VC dimension), so, in a sense, the word "outlier" has little meaning in this context. Outliers are usually considered with respect to a given distribution. Although, to be completely fair, the word is often used anyway because SVMs are lumped in with other classification algorithms.

Don't generally trust stuff you read on Data Science Central. (Another chestnut in the article is that random forests make no assumptions about the data, which is false. Random forests are built on trees, which make the assumption that splitting the data space into hyper-rectangles is a good idea.)
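The hyper-rectangle remark can be illustrated with a small sketch, under assumptions of my own rather than anything in the article being criticized. Points are drawn uniformly in the unit square and labeled by the diagonal rule x1 > x2. A single axis-aligned split (a decision stump, the building block of a tree) does poorly on either raw coordinate, but it is perfect on the rotated feature x1 - x2. The helper name `best_stump_accuracy` and the data-generating rule are hypothetical. A single split is a deliberate simplification: a real tree or forest stacks many splits and approximates the diagonal with a staircase.

```python
import random


def best_stump_accuracy(values, labels):
    """Best in-sample accuracy of one threshold rule on one feature, either polarity."""
    n = len(values)
    best = 0.0
    for t in values:  # candidate thresholds are the observed feature values
        correct = sum(1 for v, y in zip(values, labels) if (v > t) == (y == 1))
        acc = correct / n
        # Flipping the predicted classes turns accuracy acc into 1 - acc.
        best = max(best, acc, 1.0 - acc)
    return best


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(1000)]
labels = [1 if x1 > x2 else 0 for x1, x2 in points]  # diagonal class boundary

acc_x1 = best_stump_accuracy([p[0] for p in points], labels)
acc_x2 = best_stump_accuracy([p[1] for p in points], labels)
acc_diag = best_stump_accuracy([p[0] - p[1] for p in points], labels)

print("split on x1:     ", acc_x1)
print("split on x2:     ", acc_x2)
print("split on x1 - x2:", acc_diag)
```

For this setup the best axis-aligned split has a population accuracy of 0.75 (threshold at 0.5), so the two raw-coordinate numbers should sit near that value, while the split on x1 - x2 separates the classes exactly. The method is not assumption-free; it just works well when axis-aligned boxes suit the data.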
